Summary of Progress


During the period August 1, 1989 - January 31, 1990, progress was made in the following
areas: 

1) Performance Analysis of Bandwidth Efficient Trellis Codes 

Two methods have traditionally been employed to analyse the performance of various 
coding schemes. One method bounds the achievable free distance of particular classes of 
codes since free distance is the most important parameter that influences the performance 
of a code. The other method uses a random coding approach to directly bound the average
error probability of an ensemble of codes. The best codes are then known to perform at least
as well as the bound. This method is the one originally taken by Shannon. 

Most of the performance analyses published for trellis coded modulation (TCM) schemes
have used the first method, i.e., to bound the achievable free distance of particular classes
of codes. We have just completed a new analysis of TCM schemes which uses the random
coding approach. A paper summarizing these results has been submitted for publication
to the IEEE Transactions on Information Theory [1]. A copy of this paper is included as
Appendix A of this report. The most interesting aspect of this paper is that the cutoff rate
$R_0$ of the channel is shown to be the most important factor determining the performance
of TCM schemes. This fact can be used to find signal constellations which maximize the
performance of a particular class of codes when combined with an appropriate mapping.

We have also continued our work on the performance analysis of concatenation schemes
with TCM inner codes and Reed-Solomon (RS) outer codes. Our previous work on this
problem, summarized in earlier reports submitted to NASA and detailed in several journal
and conference publications, used an approach of simulating the performance of the inner
code and then using RS code bounds to determine overall performance. This approach was
necessitated by the fact that all previous performance bounds for TCM schemes treated only
the bit error probability, whereas for concatenation schemes the symbol error probability of
the inner code is the parameter of interest.

We have now developed a new bound on the symbol error probability of trellis codes.
A summary of this work, which was recently presented at the 1990 IEEE International
Symposium on Information Theory [2], is included as Appendix B of this report. Using this
new bound, we are now able to do a complete analysis of TCM/RS concatenation schemes
without resorting to simulations. This will allow us to examine the performance of a much
greater variety of possible concatenation schemes, since simulation studies are particularly
difficult and time consuming for TCM codes. Mr. Lance Perez, a Ph.D. student supported 
by the grant, is conducting this phase of our research. We plan to submit a paper for 
publication on this new bound in the near future. 
2) Construction of Bandwidth Efficient Trellis Codes

In our annual status report of October 1989, we included the final version of a full length 
paper in which a large number of new trellis codes were constructed. Most of these codes 
used multi-dimensional (multi-D) 4-PSK, 8-PSK, and 16-PSK signal constellations, although 
new codes for two-dimensional (2-D) signal constellations were also given. We have since 
begun work on the construction of two new classes of trellis codes: 

a) Nonlinear 2-D trellis codes which are fully invariant to discrete rotations of the PSK 
signal set. 

b) Multi-D trellis codes for QAM signal sets. 

Rotational invariance is a desirable feature for TCM schemes. Rotationally invariant 
codes have the property that if the demodulator locks onto the wrong phase of the received 
signal, the decoder will suffer only a slight degradation in performance. (This also assumes 
the use of differential encoding and decoding.) This is particularly important in applications 
where the traffic (or the channel) is bursty, thereby causing the demodulator to periodically 
reacquire phase lock. Unfortunately, no 2-D linear convolutional code can be fully invariant to 
discrete phase rotations of the signal set. This is one of the motivating factors in considering 
multi-D signal sets, where it is possible to find linear codes with full rotational invariance. On 
the other hand, 2-D TCM schemes are much simpler to implement than multi-D schemes and 
are often required for this reason. This led us to the construction of nonlinear convolutional 
codes for 2-D signal sets which have full rotational invariance. In general, there is a small 
price in performance to be paid to guarantee rotational invariance in the 2-D case. A 
summary of our new nonlinear codes, presented at the 1989 IEEE Workshop on Information 
Theory [3], is included as Appendix C of this report. This work is being conducted by Mr. 
Steven Pietrobon, a Ph.D. student supported by the grant. A full length paper is being 
prepared for submission in the near future which will contain an extensive list of nonlinear 
rotationally invariant codes for 8-PSK and 16-PSK signal constellations. 

In some applications, constant amplitude signals such as PSK may not be required. In 
this case, other signal constellations such as QAM can be considered. We have extended our 
constructions of multi-D TCM codes to the QAM case. Generally, better performance can 
be obtained with QAM than with PSK because there is more flexibility in assigning signal 
points, thereby making it possible to achieve larger free distances with the same average signal 
energy. A brief summary of our new QAM code constructions, recently presented at the 1990 
IEEE International Symposium on Information Theory [4], is included as Appendix D of this 
report. This work is being performed by Mr. Steven Pietrobon, a Ph.D. student supported 
by the grant. A full length paper is being prepared for submission in the near future which 
will contain extensive lists of multi-D codes for a variety of QAM signal constellations. 
3) Sequential Decoding of Trellis Codes

One of the major thrusts of our future research efforts under the grant will be the
development of suboptimum decoding methods for TCM schemes. Optimum (Viterbi) decoding
can only be used to obtain moderate error rates on the order of 10 10 on many channels.
To obtain lower error rates would require the use of prohibitively complex decoders
(long constraint or block lengths). Therefore, achieving error rates in the range 10 - 10
will require the use of longer codes and suboptimum (but still very good) decoding methods
which are insensitive to code constraint (block) length. (Another approach to the problem of
achieving lower error rates than can be obtained with Viterbi decoding is to use concatenated
coding, which is under continuing investigation.)

Sequential decoding has long been recognized as a nearly optimum decoding method 
whose complexity is insensitive to code constraint length. Therefore sequential decoding can 
be used with large constraint length codes. One major problem with sequential decoders, 
however, is that long searches are occasionally necessary, and this may result in some lost or 
erased data. Therefore, in order to fairly compare sequential decoding with Viterbi decoding, 
it is necessary to account for the erasures in some way, since Viterbi decoders never erase
any information.

We have begun the development of an erasure-free version of sequential decoding which
can be directly compared to Viterbi decoding. Some preliminary results of this work, which
were presented at the 1990 IEEE International Symposium on Information Theory [5], are
included as Appendix E of this report. Our erasure-free sequential decoding algorithm, called
the buffer looking algorithm (BLA), appears to perform quite well. Simulation results show
that its performance with a constraint length 13, rate 2/3, 8-PSK trellis code is about 1 dB
superior to Viterbi decoding of a constraint length 8, rate 2/3, 8-PSK trellis code at a decoded
error probability of $10^{-5}$. At lower error rates, we would expect the relative performance
of the sequential decoder to be even better. A complete comparison of the performance, 
complexity, and delay of sequential decoding and Viterbi decoding of trellis codes will be the 
subject of future reports, but the preliminary results look very encouraging. Mr. Fu-Quan 
Wang, a Ph.D. student supported by the grant, is conducting our research on sequential 
decoding. Dr. Daniel J. Costello, Jr., the principal investigator on the grant, has been asked 
to give an invited lecture on this research at the 1990 IEEE Information Theory Workshop 
to be held in Eindhoven, The Netherlands, in June. 
Abstract

This paper presents an expurgated upper bound on the event error probability of
trellis coded modulation. This bound is used to derive a lower bound on the minimum
achievable free Euclidean distance $d_{free}$ of trellis codes. It is shown that the dominant
parameters for both bounds, the expurgated error exponent and the asymptotic
$d_{free}$ growth rate, respectively, can be obtained from the cutoff-rate $R_0$ of the
transmission channel by a simple geometric construction, making $R_0$ the central parameter
for finding good trellis codes. Several constellations are optimized with respect to the
bounds.


*This work was supported by NASA Grant NAG5-557 and NSF Grant NCR89-03429. 
I. Introduction

In recent years bandwidth efficient trellis coded modulation (TCM) has become increasingly 
popular and much analysis has been devoted to the performance of these coding schemes 
on AWGN-channels (see [1-5] and the references therein). It is well known that for large
signal-to-noise ratio (SNR), the minimum free Euclidean distance $d_{free}$ of a trellis code is
the dominant parameter of a code's performance. Much research has gone into the search
for and the construction of codes with large $d_{free}$. While most of this work has focused on
finding good trellis codes with a given signal constellation, the constellation itself is also
a parameter in the system design. There have been a few attempts to design codes using
non-standard signal constellations, like the asymmetric MPSK signal sets introduced in [ ].
These codes showed slight performance improvements, but no general rule on how to choose
a constellation is known.

In this paper we show that a signal constellation with a good value of the cutoff-rate $R_0$
[7] will indicate the existence of codes with good $d_{free}$ and good performance. This is done
by calculating an expurgated upper bound on the first event error probability of a trellis
code and relating it to $d_{free}$.

A code's minimum free Euclidean distance $d_{free}$¹ is often used to obtain an estimate of
the code's error performance as follows:

$$P_e \approx N_{free}\, Q\left( d_{free} \sqrt{\frac{E_s}{2 N_0}} \right),$$

where $N_{free}$ is the path multiplicity of the code, i.e., the number of error events with distance
$d_{free}$, and $Q(x) = \int_x^{\infty} \frac{1}{\sqrt{2\pi}} \exp(-\beta^2/2)\, d\beta$. This approximation provides a good asymptotic
estimate of a code's performance.
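The asymptotic estimate above can be evaluated numerically. The following is a minimal sketch, assuming unit-energy (normalized) distances and the form $P_e \approx N_{free} Q(d_{free}\sqrt{E_s/(2N_0)})$; the function names and the code parameters in the loop are hypothetical and chosen only for illustration.

```python
import math


def q_function(x):
    """Gaussian tail probability Q(x) = P(N(0, 1) > x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def asymptotic_event_error(n_free, d_free, es_n0_db):
    """Estimate P_e ~ N_free * Q(d_free * sqrt(Es / (2 N0))).

    d_free is the normalized (unit signal energy) free Euclidean distance,
    es_n0_db is Es/N0 in decibels.
    """
    es_n0 = 10.0 ** (es_n0_db / 10.0)
    return n_free * q_function(d_free * math.sqrt(es_n0 / 2.0))


if __name__ == "__main__":
    # Hypothetical code parameters (not taken from the paper).
    for snr_db in (8.0, 10.0, 12.0):
        print(snr_db, asymptotic_event_error(n_free=2, d_free=2.0, es_n0_db=snr_db))
```

Because the estimate keeps only the minimum-distance term, it is meaningful mainly at large SNR, as the text notes.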

This paper is organized in the following way. Section II describes TCM and the definitions 
used later. In Sections III and IV we derive a random coding bound and an expurgated
bound on the first event error probability of TCM. The casual reader may want to skip this
derivation and proceed directly to Theorem 1 in Section IV. In Section V we present a strict
lower bound on the event error probability involving $d_{free}$, and, relating it to the expurgated
upper bound, we rederive the lower bound on $d_{free}$ originally presented by Rouanne and
Costello [8]. In Section VI we develop a geometric approach to constructing the bounds and 
determine a number of optimized constellations. Section VII contains the conclusions. 


II. Trellis Coded Modulation 

A general TCM communication system (Figure 1) consists of a trellis encoder, a modulator,
the transmission channel, a demodulator, and a trellis decoder. The structure of a trellis
code is generated by a binary convolutional encoder, which is a finite state automaton with
$2^{\nu}$ possible states, where $\nu$ is the total memory of the encoder. In the minimal realization
[9], the encoder consists of $\tilde{k}$ feedback free shift register chains of lengths $\nu_1, \ldots, \nu_{\tilde{k}}$. We
assume in this paper that $\nu_1 = \nu_2 = \cdots = \nu_{\tilde{k}} = \nu_m$, where $\nu_m$ is the memory length of the
code. It then follows that the shortest non-zero path has length $\mu = \nu_m + 1$. $\mu$ is called the
constraint length of the code. An extension to different values of $\nu_i$ is generally possible but
messy, and does not seem to provide any additional insight. At each time interval $r$, the
encoder accepts $\tilde{k}$ binary input bits $(u_r^{\tilde{k}}, u_r^{\tilde{k}-1}, \ldots, u_r^{1})$ and makes a transition from its state
$S_r$ at time $r$ to one of $2^{\tilde{k}}$ possible successor states $S_{r+1}$ at time $r + 1$.

¹ Note that all Euclidean distances are normalized, i.e., they are based on unit energy signal constellations.

The $\tilde{n} = n - (k - \tilde{k})$ output bits from the convolutional encoder and $k - \tilde{k}$ uncoded
information bits $(u_r^{k}, \ldots, u_r^{\tilde{k}+1})$ form one of $2^n$ binary $n$-tuples $v_r = (v_r^{n}, v_r^{n-1}, \ldots, v_r^{1})$, called
a signal selector. The sequence $V = (v_1, \ldots, v_l)$ of signal selectors is the label of a path
through a linear trellis², generated by the convolutional encoder. $v_r$ is then mapped into
$x_r$, one of $A = 2^n$ $d$-dimensional channel symbols from a signal set $\mathcal{A} = \{a_1, a_2, \ldots, a_A\}$ of
cardinality $A$.

The uncoded information bits do not affect the state of the convolutional encoder and
cause $2^{k-\tilde{k}}$ parallel transitions between the encoder states $S_r$ and $S_{r+1}$. A rate $R = k/n$
trellis code transmits $k$ bits/channel signal.

In practical systems, one often uses 2-dimensional (complex) signal sets for their ease of 
implementation, and the real part and imaginary part of x T drive the direct and quadrature 
component of the modulator. 

III. A Random Coding Bound for Time Varying Trellis Codes 
on General Memoryless Channels 

In this section we derive a random coding upper bound on the event error probability of a trellis
code. The derivation is similar to that given in Viterbi and Omura [10] for convolutional
codes. Throughout the derivation we assume that the codes are used in conjunction with
a maximum-likelihood decoder that operates on a decoding metric $m(x,y)$, where $x =
(x_1, \ldots, x_l)$ is a sequence of transmitted symbols $x_i$ and $y = (y_1, \ldots, y_l)$ is the corresponding
received symbol sequence. By convention, the signal $x$ with the lowest metric is the most
reliable, i.e., $m(x,y)$ is some non-negative function of $x$ given $y$, which is inversely related to
the conditional probability that $x$ was transmitted given that $y$ was received. The decoder
then chooses the message sequence $x$ for which this metric is minimized. It makes an error
if it decodes a sequence $x'$, given that the correct sequence, i.e., the transmitted sequence,
was $x$. This happens if $m(x',y) < m(x,y)$.

Let $V$ and $V'$ be labeled paths through the trellis, i.e., $V$ and $V'$ describe trellis paths
without signals assigned to them. We refer to $V$ as the correct path if it is the one followed
by the encoder. Let $V'$ be a path that diverges from $V$ at node $j$. We call $V'$ an incorrect
path. Further, let $\mathcal{V}'$ be the set of all incorrect paths $V'$ that diverge from $V$ at node $j$. The
paths $V'$ eventually remerge with $V$ and we call the number of branches over which $V$ and
$V'$ differ the length of $V'$. Due to the linearity of the labeling, the sets $\mathcal{V}'$ for different correct
paths $V$ are equivalent, i.e., they contain the same number of paths of the same lengths. In
a particular trellis code, let $x$ be the sequence of signals assigned to the correct path $V$, and
let $x'$ be the sequence of signals assigned to $V'$.

² Here linear means that if the binary output sequence $V$ of the convolutional encoder is used to label a
path in the trellis, the modulo-2 sum of two labels is a label for another valid path.

Our goal is to obtain an upper bound on the first event error probability $P_e(j)$, the
probability that the decoder starts an error event at node $j$. An error event starts at node
$j$ if the decoder chooses an incorrect path $V'$ with its associated signal sequence $x'$ over the
correct path $V$ with signal sequence $x$ starting at node $j$, as illustrated in Figure 2.

A necessary but not sufficient condition for such an error event to occur is that the
incorrect path $V'$ accumulates a smaller total metric than the correct path $V$ over their
unmerged segments or time intervals of the trellis. The probability $P_e(j)$ may then be upper
bounded by the probability that any path $V' \in \mathcal{V}'$ diverging from the correct path $V$ at node
$j$ accumulates a lower total metric than the correct path $V$. This probability must then be
averaged over all correct paths $V$. Letting $p(V)$ denote the probability of path $V$, we obtain

$$P_e(j) \le \sum_{V} p(V) \sum_{y} p(y|x)\, I\Big\{ \bigcup_{V' \in \mathcal{V}'} V'\big(m(x',y) - m(x,y) < 0\big) \Big\}, \qquad (1)$$

where $V'(m(x',y) - m(x,y) < 0)$ is a path $\in \mathcal{V}'$ for which $m(x',y) - m(x,y) < 0$, and $I(B)$
is a set indicator function such that $I(B) = 0$ if $B = \emptyset$, the empty set, and $I(B) = 1$ if
$B \ne \emptyset$. $p(y|x)$ is the conditional probability of receiving sequence $y$ if the encoder follows
path $V$ and transmits the signal sequence $x$. This conditional probability depends on the
particular channel over which the sequences are transmitted.

If the received signal sequence $y$ consists of real valued symbols, rather than discrete
signal points (unquantized decoding), the summation in (1) is replaced by an integration
over the space of $y$, i.e.,

$$P_e(j) \le \sum_{V} p(V) \int_{y} p(y|x)\, I\Big\{ \bigcup_{V' \in \mathcal{V}'} V'\big(m(x',y) - m(x,y) < 0\big) \Big\}\, dy. \qquad (2)$$

It is, in general, too difficult to evaluate (1) or (2) exactly and we therefore resort to further
bounding techniques. Using the inequality $I\{\bigcup_i B_i\} \le \sum_i I\{B_i\}$, we may immediately
simplify (2) to obtain an upper bound of the form:

$$P_e(j) \le \sum_{V} p(V) \int_{y} p(y|x) \sum_{V' \in \mathcal{V}'} I\big\{ V'\big(m(x',y) - m(x,y) < 0\big) \big\}\, dy. \qquad (3)$$

In order for an incorrect path $V'$ to merge with the correct path $V$ at node $j + l$, the
last $\nu_m$ entries in the information sequences $u'^{1}, \ldots, u'^{k}$ associated with $V'$ must equal the
last $\nu_m$ entries in the information sequences $u^{1}, \ldots, u^{k}$ associated with $V$, i.e., $u'^{i}_r = u^{i}_r$ for
$r \in \{j + l - \nu_m, \ldots, j + l - 1\}$ and $i = 1, 2, \ldots, k$. That this is the case can easily be seen
by noting that in order for the two paths $V'$ and $V$ to merge at node $j + l$, their associated
encoder states must be identical. Because an information bit entering the encoder can affect
the output for $\nu_m$ time units, this is also the time it takes to force the encoder into any
given state from any arbitrary starting state; in particular, to have $V'$ join $V$ at node $j + l$.
Because the remaining information bits $u'^{i}_r$ for $r \in \{j, \ldots, j + l - \mu\}$ are arbitrary, we have
$M \le (2^k - 1)2^{k(l-\mu)}$ incorrect paths $V'$ of length $l$. (Note that the choice of the information
bits at $r = j$ is restricted because we stipulated that the incorrect path diverges at node $j$,
which rules out the one path that continues to the correct state at node $j + 1$. This accounts
for the term $2^k - 1$ in the expression for $M$.)
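The counting argument above can be checked by brute force on small cases. The sketch below is illustrative only and rests on stated assumptions: each of the $k$ shift registers has length $\nu_m$, the encoder state is taken to be the last $\nu_m$ input symbols, and linearity is used so that only the difference between the incorrect and correct input sequences matters. The function names are hypothetical.

```python
from itertools import product


def path_count_bound(k, nu_m, l):
    """Upper bound M <= (2^k - 1) * 2^(k (l - mu)) with mu = nu_m + 1."""
    mu = nu_m + 1
    if l < mu:
        return 0
    return (2**k - 1) * 2 ** (k * (l - mu))


def count_unmerged_paths(k, nu_m, l):
    """Count input-difference sequences of exactly l branches (nu_m >= 1).

    A sequence is counted if it diverges at the first branch, has its last
    nu_m symbols equal to zero (so the states agree after l branches), and
    does not reach the all-zero state difference any earlier.
    """
    if l < nu_m + 1:
        return 0
    count = 0
    for e in product(range(2**k), repeat=l):
        if e[0] == 0:            # must diverge at node j
            continue
        if any(e[l - nu_m:]):    # must remerge at node j + l
            continue
        # State difference after branch t is the window of the last nu_m symbols.
        merged_early = any(
            not any(e[max(0, t - nu_m + 1): t + 1]) for t in range(l - 1)
        )
        if not merged_early:
            count += 1
    return count


if __name__ == "__main__":
    for l in range(1, 8):
        print(l, count_unmerged_paths(1, 2, l), path_count_bound(1, 2, l))
```

The exact count falls below the bound for longer paths because the bound does not exclude paths that remerge before node $j + l$, which is why the text states an inequality.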

We now proceed to evaluate $\int_y p(y|x)\, I\{V'(m(x',y) - m(x,y) < 0)\}\, dy$ for a particular path
pair $(V', V)$ of length $l$. Let us write (3) as

$$P_e(j) \le \sum_{V} p(V) \sum_{V' \in \mathcal{V}'} P(x \to x'), \qquad (4)$$

where

$$P(x \to x') = \int_{y} p(y|x)\, I\{V'(m(x',y) - m(x,y) < 0)\}\, dy = E_{y|x}\big[ I\{V'(m(x',y) - m(x,y) < 0)\} \big],$$

and $E_{y|x}$ denotes conditional expectation. We now use the Chernoff bounding technique [11]
and overbound $I[\alpha < 0]$ by $\exp(-\lambda\alpha)$ to obtain

$$P(x \to x') \le E_{y|x}\big[ \exp(-\lambda\{m(x',y) - m(x,y)\}) \big] = C(x, x', \lambda),$$

where $\lambda$ is a non-negative real valued parameter over which $C(x, x', \lambda)$ is minimized to
obtain the tightest possible bound. We call $C(x, x', \lambda)$ the Chernoff bound between the
signal sequences $x'$ and $x$.

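We now express (4) as the sum over individual sequences of length $l$

$$P_e(j) \le \sum_{V} p(V) \sum_{l=\mu}^{\infty} \sum_{V_l' \in \mathcal{V}_l'} C(x, x', \lambda) = \sum_{l=\mu}^{\infty} \sum_{V_l \in \mathcal{V}_l} p(V_l) \sum_{V_l' \in \mathcal{V}_l'} C(x, x', \lambda), \qquad (5)$$

where $\mathcal{V}_l$ is the set of all correct paths $V_l$ of length $l$ starting at node $j$ and $\mathcal{V}_l'$ is the set of
all incorrect paths $V_l'$ of length $l$ unmerged with $V_l$ from node $j$ to node $j + l$. Note that
$\bigcup_l \mathcal{V}_l' = \mathcal{V}'$.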

$P_e(j)$ is the event error probability of a particular code since it depends on the signal
sequences $x$ and $x'$ of the code. The aim of this section is to obtain a bound on an ensemble
of trellis codes, and we therefore must average over the event error probabilities of all the
codes in the ensemble, i.e.,

$$\overline{P_e(j)} \le \sum_{l=\mu}^{\infty} \sum_{V_l \in \mathcal{V}_l} p(V_l) \sum_{V_l' \in \mathcal{V}_l'} \overline{C(x, x', \lambda)}, \qquad (6)$$

where the overbar denotes an ensemble average.

Using the linearity of the expectation operator and noting that there are exactly $2^{kl}$
equiprobable paths in $\mathcal{V}_l$, because at each time interval there are $2^k$ possible choices to
continue the correct path, we obtain

$$\overline{P_e(j)} \le \sum_{l=\mu}^{\infty} \frac{1}{2^{kl}} \sum_{V_l \in \mathcal{V}_l} \sum_{V_l' \in \mathcal{V}_l'} \overline{C(x, x', \lambda)} = \sum_{l=\mu}^{\infty} \pi_l(j), \qquad (7)$$

where we have implicitly defined $\pi_l(j)$.

We will now proceed to evaluate $\pi_l(j)$. Let $x_1, \ldots, x_N$ be a set of possible correct signal
sequences associated with the paths $V_l \in \mathcal{V}_l$ as we go through the codes in the ensemble and
let $q_{lN}(x_1, \ldots, x_N)$ be their probability of occurrence. Note that there are $M$ incorrect paths
$V_l' \in \mathcal{V}_l'$ with signal sequences $x_1', \ldots, x_M'$ that spread around each correct path $V_l$. Because
each incorrect path in $\mathcal{V}_l'$ is also a possible correct path $V_l$ of length $l$, we have $\mathcal{V}_l' \subset \mathcal{V}_l$.
Averaging over all codes in the ensemble is the same as averaging over all possible signal
sequences in these codes, i.e., over all assignments of signal sequences $x$ to paths $V$. We then
obtain

$$\pi_l(j) \le \frac{1}{2^{kl}} \sum_{x_1} \cdots \sum_{x_N} q_{lN}(x_1, \ldots, x_N) \sum_{h=1}^{N} \sum_{i=1}^{M} C(x_h, x_i', \lambda)$$

$$= \frac{1}{2^{kl}} \sum_{h=1}^{N} \sum_{i=1}^{M} \sum_{x_h} \sum_{x_i'} q_l(x_h)\, q_l(x_i'|x_h)\, C(x_h, x_i', \lambda), \qquad (8)$$

where in the last step we have summed over all pairs of sequences $x, x' \ne x_h, x_i'$. We have
now obtained a bound where we can limit our attention to one correct signal sequence $x_h$
and one incorrect signal sequence $x_i'$, both of length $l$.

In order to proceed further, we will now restrict our attention to memoryless channels.
On a memoryless channel, the metrics become additive over the individual time units, i.e.,

$$m(x, y) = \sum_{r=1}^{l} m(x_r, y_r).$$

This allows us to rewrite (5) as

$$C(x, x', \lambda) = \prod_{r=1}^{l} C(x_r, x_r', \lambda) = \prod_{r=1}^{l} E_{y_r|x_r}\big[ \exp(-\lambda\{m(x_r', y_r) - m(x_r, y_r)\}) \big],$$

where $C(x_r, x_r', \lambda)$ is the Chernoff factor between the signals $x_r'$ and $x_r$.

We now assume further that in composing our code ensemble, each individual signal in
each sequence is chosen independently according to a common probability distribution $q(x)$,
i.e., $q_l(x_h) = \prod_{r=1}^{l} q(x_r)$ and $q_l(x_i'|x_h) = \prod_{r=1}^{l} q(x_r')$, respectively. In order to make this
possible we must assume that the trellis codes are time-varying in nature, for otherwise each
symbol would also depend on the choices of the $\nu_m$ last symbols. We now obtain a much
simpler version of the above bound, namely

$$\pi_l(j) \le \frac{1}{2^{kl}} \sum_{h=1}^{N} \sum_{i=1}^{M} \sum_{x} \sum_{x'} \prod_{r=1}^{l} q(x_r)\, q(x_r')\, C(x_r, x_r', \lambda). \qquad (9)$$
Because the choice of the signals $x_r$ and $x_r'$ does not depend on the particular signal sequences
$x_h$ and $x_i'$, we dropped the dependency on $h$ and $i$ in (9). Upon interchanging multiplication
and summation we obtain

$$\pi_l(j) \le \frac{1}{2^{kl}} \sum_{h=1}^{N} \sum_{i=1}^{M} \prod_{r=1}^{l} \sum_{x_r} \sum_{x_r'} q(x_r)\, q(x_r')\, C(x_r, x_r', \lambda). \qquad (10)$$

The signals $x_r, x_r'$ are chosen randomly from the signal set $\mathcal{A} = \{a_1, \ldots, a_A\}$, where $p(a_p)$ is
the probability of choosing $a_p$, i.e., $q(x_r) = p(a_p)$ if $x_r = a_p$. We may now rewrite (10) as

$$\pi_l(j) \le \frac{1}{2^{kl}} \sum_{h=1}^{N} \sum_{i=1}^{M} \prod_{r=1}^{l} \sum_{m=1}^{A} \sum_{p=1}^{A} p(a_m)\, p(a_p)\, C(a_m, a_p, \lambda)$$

$$= \frac{1}{2^{kl}} \sum_{h=1}^{N} \sum_{i=1}^{M} \left( \sum_{m=1}^{A} \sum_{p=1}^{A} p(a_m)\, p(a_p)\, C(a_m, a_p, \lambda) \right)^{l}$$

$$\le (2^k - 1)\, 2^{k(l-\mu)} \left( \sum_{m=1}^{A} \sum_{p=1}^{A} p(a_m)\, p(a_p)\, C(a_m, a_p, \lambda) \right)^{l}.$$


Let us now define $R_0(p)$ as

$$R_0(p) = -\log_2 \min_{\lambda} \sum_{m=1}^{A} \sum_{p=1}^{A} p(a_m)\, p(a_p)\, C(a_m, a_p, \lambda). \qquad (11)$$


We may now finally evaluate the average event error probability $\overline{P_e(j)}$ at time unit $j$ as

$$\overline{P_e(j)} \le \sum_{l=\mu}^{\infty} \pi_l(j) \le (2^k - 1)\, 2^{-\mu R_0(p)} \sum_{t=0}^{\infty} 2^{kt}\, 2^{-t R_0(p)} = \frac{(2^k - 1)\, 2^{-\mu R_0(p)}}{1 - 2^{-(R_0(p) - k)}}; \qquad 0 \le k < R_0(p).$$

Since $k$ is the number of information bits transmitted in one channel symbol $x_r$, we may
call it the information rate in bits per channel use and denote it by the symbol $R$. $\overline{P_e(j)}$ is
independent of the node $j$ and we may thus drop the parameter $j$ and obtain

$$\overline{P_e} \le \frac{(2^R - 1)\, 2^{-\mu R_0(p)}}{1 - 2^{-(R_0(p) - R)}}; \qquad 0 \le R < R_0(p). \qquad (12)$$
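The closed form in (12) comes from summing a geometric series in the path length $l$. As a small numerical sketch (the function names and the sample values of $R$, $\mu$ and $R_0$ are hypothetical), one can compare a truncated version of the sum with the closed-form expression:

```python
import math


def random_coding_bound(rate, mu, r0):
    """Closed form of (12): (2^R - 1) 2^(-mu R0) / (1 - 2^-(R0 - R))."""
    if not 0 <= rate < r0:
        raise ValueError("the bound requires 0 <= R < R0")
    return (2.0**rate - 1.0) * 2.0 ** (-mu * r0) / (1.0 - 2.0 ** (-(r0 - rate)))


def truncated_path_sum(rate, mu, r0, terms):
    """Sum of (2^R - 1) 2^(R (l - mu)) 2^(-l R0) over l = mu, ..., mu + terms - 1."""
    return sum(
        (2.0**rate - 1.0) * 2.0 ** (rate * (l - mu) - l * r0)
        for l in range(mu, mu + terms)
    )


if __name__ == "__main__":
    # Hypothetical values: R = 2 bits/symbol, constraint length 5, R0 = 2.9 bits.
    print(random_coding_bound(2, 5, 2.9), truncated_path_sum(2, 5, 2.9, 200))
```

The series converges only for $R < R_0(p)$, which is where the rate restriction in (12) originates.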


The parameter

$$R_0 = \max_{p} \left( -\log_2 \min_{\lambda} \sum_{m=1}^{A} \sum_{p=1}^{A} p(a_m)\, p(a_p)\, C(a_m, a_p, \lambda) \right) \qquad (13)$$

is the cutoff-rate of the channel and (12) holds for all rates $R < R_0(p)$. We will later use the
uniform distribution $p = 1/A$ in (13) and refer to $R_0 = R_0(1/A)$ as the cutoff rate³ unless
noted otherwise, even though the strict definition of cutoff-rate is (13).

Note that $R_0$ depends on the particular metric $m(y_r, x_r)$ which is used by the decoder.
If the decoder uses the maximum-likelihood (ML) metric for a memoryless channel, i.e.,

$$m(y, x) = -\log(\Pr(y|x)) = -\log \prod_{r=1}^{l} \Pr(y_r|x_r) = \sum_{r=1}^{l} \big( -\log(\Pr(y_r|x_r)) \big) = \sum_{r=1}^{l} m(x_r, y_r),$$

(13) becomes the channel cutoff-rate for the optimum receiver, which is the usual definition
of $R_0$ [7]. We will denote the value of $\lambda$ which maximizes (13) by $\lambda_M$. In this case, the
Chernoff factors will be written as $C(a_m, a_p) = C(a_m, a_p, \lambda_M)$.

The actual evaluation of the maximum-likelihood metric for most channels is not simple,
however. In fact, only for the AWGN-channel does the maximum-likelihood metric assume
a form simple enough to be implemented in decoding circuits [7]. For the AWGN-channel,
the maximum-likelihood metric is the squared Euclidean distance between the received
sequence $y$ and the transmitted sequence $x$, i.e., $m(y, x) = \sum_{r=1}^{l} |y_r - x_r|^2$. With this metric
the sum in (13) is minimized by setting $\lambda = \lambda_M = 1/(2N_0)$, and the Chernoff factors turn out to be
exponentials in the squared Euclidean distance, i.e.,

$$C(x_r, x_r') = \exp\left( -\frac{E_s}{4 N_0} |x_r - x_r'|^2 \right),$$

where $E_s$ is the average signal energy.
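The choice $\lambda = 1/(2N_0)$ can be illustrated numerically. The sketch below assumes complex AWGN with variance $N_0/2$ per dimension and unit-energy signal points scaled by $\sqrt{E_s}$; under those assumptions the Chernoff factor for the squared-distance metric works out to $\exp(-(u - u^2)(E_s/N_0)|x_r - x_r'|^2)$ with $u = \lambda N_0$. That closed form is derived here for illustration and is not quoted from the paper; the function names are hypothetical.

```python
import math


def q_function(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def chernoff_factor(dist_sq, es_n0, u):
    """Chernoff factor as a function of u = lambda * N0 (assumed closed form)."""
    return math.exp(-(u - u * u) * es_n0 * dist_sq)


def best_lambda_times_n0(dist_sq, es_n0, steps=1000):
    """Grid search over u in [0, 1] for the tightest Chernoff factor."""
    grid = [i / steps for i in range(steps + 1)]
    return min(grid, key=lambda u: chernoff_factor(dist_sq, es_n0, u))


def exact_pairwise_error(dist_sq, es_n0):
    """Exact two-signal error probability Q(d * sqrt(Es / (2 N0)))."""
    return q_function(math.sqrt(dist_sq * es_n0 / 2.0))


if __name__ == "__main__":
    d2, snr = 2.0, 10.0  # hypothetical normalized squared distance and Es/N0
    u_star = best_lambda_times_n0(d2, snr)
    print(u_star, chernoff_factor(d2, snr, u_star), exact_pairwise_error(d2, snr))
```

The minimizing value $u = 1/2$ does not depend on the distance or the SNR, which is why a single $\lambda$ serves every signal pair in the constellation.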

From this it is easily seen that a code's performance is dominated by the two distinct
sequences $x_1$ and $x_2$ that are closest to each other in terms of squared Euclidean distance.
Their distance is referred to as the minimum free squared Euclidean distance, or $d_{free}^2$, of the
code, defined as

$$d_{free}^2 = \min_{x_1 \ne x_2} \sum_{r=1}^{l} |x_{1r} - x_{2r}|^2.$$

Figure 3 shows $R_0$ for the AWGN-channel as a function of the ratio of the average signal
energy $E_s$ over the average noise power $N_0$ for a number of popular signal constellations. It is
interesting to note that rectangular constellations fare slightly better than constant envelope 
constellations with the same number of signal points. The reason for this lies in the added 
flexibility provided by the amplitude modulation in the case of rectangular constellations. 
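The comparison between rectangular and constant envelope constellations can be reproduced in a few lines. This is a rough sketch, assuming the uniform input distribution $p = 1/A$, the AWGN Chernoff factor $\exp(-E_s|a_m - a_p|^2/(4N_0))$, and unit average energy constellations; the helper names are hypothetical and the constellations are generic 16-PSK and square 16-QAM rather than the specific ones plotted in Figure 3.

```python
import cmath
import math


def normalize(points):
    """Scale a constellation to unit average energy."""
    avg_energy = sum(abs(p) ** 2 for p in points) / len(points)
    return [p / math.sqrt(avg_energy) for p in points]


def psk(size):
    """Constant envelope constellation with `size` points on the unit circle."""
    return [cmath.exp(2j * math.pi * i / size) for i in range(size)]


def square_qam(side):
    """Rectangular constellation with side * side points, unit average energy."""
    levels = [2 * i - (side - 1) for i in range(side)]
    return normalize([complex(x, y) for x in levels for y in levels])


def cutoff_rate(points, es_n0_db):
    """Symmetric cutoff rate R0(1/A) in bits per channel symbol."""
    es_n0 = 10.0 ** (es_n0_db / 10.0)
    size = len(points)
    total = sum(
        math.exp(-es_n0 * abs(a - b) ** 2 / 4.0) for a in points for b in points
    )
    return -math.log2(total / size**2)


if __name__ == "__main__":
    for snr_db in (5.0, 10.0, 15.0, 20.0):
        print(snr_db, cutoff_rate(psk(16), snr_db), cutoff_rate(square_qam(4), snr_db))
```

At high SNR both values approach $\log_2 A$, so the difference between constellations shows up mainly at moderate SNR.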

³ $R_0(1/A)$ is sometimes referred to as the symmetric cutoff-rate.

IV. Expurgated Error Bound

In this section we derive an expurgated bound on the event error probability which improves
the $R_0$-bound, especially for rates $R$ significantly below $R_0$. The event error probability for
a particular correct path $V_c$ is a special case of (4), i.e.,

$$P_{e|V_c}(j) \le \sum_{V' \in \mathcal{V}'} C(x, x', \lambda). \qquad (14)$$

Applying the inequality (see e.g. [11])

$$\Big( \sum_i a_i \Big)^{s} \le \sum_i a_i^{s}, \qquad 0 < s \le 1,$$

to (14) we obtain

$$P^{s}_{e|V_c}(j) \le \sum_{V' \in \mathcal{V}'} C^{s}(x, x', \lambda).$$

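Following analogous steps as those leading from (5) to (7), we obtain

$$\overline{P^{s}_{e|V_c}(j)} \le \sum_{l=\mu}^{\infty} \sum_{V_l' \in \mathcal{V}_l'} \overline{C^{s}(x, x', \lambda)} = \sum_{l=\mu}^{\infty} \pi_l(j, s, V_c),$$

where for memoryless channels $\pi_l(j, s, V_c)$ is given by

$$\pi_l(j, s, V_c) \le \sum_{i=1}^{M} \sum_{x_h} \sum_{x_i'} q_l(x_h)\, q_l(x_i')\, C^{s}(x_h, x_i', \lambda) \le (2^k - 1)\, 2^{k(l-\mu)} \left( \sum_{m=1}^{A} \sum_{p=1}^{A} p(a_m)\, p(a_p)\, C^{s}(a_m, a_p, \lambda) \right)^{l}.$$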


Note that $\pi_l(j, s, V_c)$ is independent of $V_c$, i.e.,

$$\pi_l(j, s, V_c) = \pi_l(j, s) \quad \text{for all } V_c \in \mathcal{V},$$

and

$$\overline{P^{s}_{e|V_c}(j)} = \overline{P^{s}_{e}(j)},$$

i.e., $P_e^{s}(j)$ averaged over all time-varying trellis codes is independent of the correct path
through the trellis and averaging over all correct paths becomes trivial. We now define $E(s)$
as

$$E(s) = -\log_2 \min_{\lambda} \sum_{m=1}^{A} \sum_{p=1}^{A} p(a_m)\, p(a_p)\, C^{s}(a_m, a_p, \lambda),$$

and proceed to obtain

$$\overline{P^{s}_{e}(j)} \le \sum_{l=\mu}^{\infty} \pi_l(j, s) \le (2^k - 1)\, 2^{-\mu E(s)} \sum_{t=0}^{\infty} 2^{kt}\, 2^{-t E(s)} = \frac{(2^k - 1)\, 2^{-\mu E(s)}}{1 - 2^{-(E(s) - k)}}; \qquad 0 \le k < E(s).$$
$\overline{P^{s}_{e}(j)}$ is the event error probability $P_e(j)$ raised to the power $s$, averaged over all codes
and all correct sequences. There must then be at least one code in the ensemble for which
$P^{s}_{e}(j) \le \overline{P^{s}_{e}(j)}$. Using this in the equation above we obtain an expurgated upper bound on
the event error probability of the best trellis code in the ensemble

$$P_e \le \left( \overline{P^{s}_{e}} \right)^{1/s} \le \left( \frac{(2^k - 1)\, 2^{-\mu E(s)}}{1 - 2^{-(E(s) - k)}} \right)^{1/s}; \qquad 0 \le k < E(s), \quad 0 < s \le 1,$$

where we have again dropped the dummy parameter $j$. It is sometimes convenient to express
this bound as a function of the memory order $\nu_m$ of a code. Since $\nu_m = \mu - 1$, we obtain

Theorem 1: There exists a rate $R = k$ trellis code, with a trellis generated by a convolutional
encoder with register lengths $\nu_i = \nu_m$ for $1 \le i \le k$, using a signal constellation $\mathcal{A} =
\{a_0, \ldots, a_{A-1}\}$ of cardinality $A$, whose error event probability $P_e$ is bounded above by

$$P_e \le \left( \frac{1 - 2^{-R}}{2^{E(s) - R} - 1} \right)^{1/s} 2^{-\nu_m E(s)/s}$$

for any $s$ and $R$ such that $0 < s \le 1$ and $0 \le R < E(s)$, where

$$E(s) = -\log_2 \min_{\lambda} \sum_{m=1}^{A} \sum_{p=1}^{A} p(a_m)\, p(a_p)\, C^{s}(a_m, a_p, \lambda).$$
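Theorem 1 can be evaluated numerically for a given constellation. The sketch below makes several assumptions that are not stated in this form in the text: the channel is AWGN, the input distribution is uniform, and the minimization over $\lambda$ is taken at $\lambda = 1/(2N_0)$, so that $C^{s}(a_m, a_p) = \exp(-s E_s |a_m - a_p|^2/(4N_0))$. The function names and the sample parameters are hypothetical. The second function evaluates the bound in its geometric series form with $\mu = \nu_m + 1$, which should agree with the Theorem 1 form.

```python
import cmath
import math


def psk(size):
    """Unit-energy PSK constellation with `size` points."""
    return [cmath.exp(2j * math.pi * i / size) for i in range(size)]


def expurgated_exponent(points, es_n0, s):
    """E(s) for AWGN and uniform inputs, assuming lambda = 1 / (2 N0)."""
    size = len(points)
    total = sum(
        math.exp(-s * es_n0 * abs(a - b) ** 2 / 4.0) for a in points for b in points
    )
    return -math.log2(total / size**2)


def theorem1_bound(rate, nu_m, exponent, s):
    """((1 - 2^-R) / (2^(E - R) - 1))^(1/s) * 2^(-nu_m E / s)."""
    if not (0 <= rate < exponent and 0 < s <= 1):
        raise ValueError("requires 0 <= R < E(s) and 0 < s <= 1")
    prefactor = (1.0 - 2.0 ** (-rate)) / (2.0 ** (exponent - rate) - 1.0)
    return prefactor ** (1.0 / s) * 2.0 ** (-nu_m * exponent / s)


def series_form_bound(rate, mu, exponent, s):
    """((2^R - 1) 2^(-mu E) / (1 - 2^-(E - R)))^(1/s)."""
    ratio = (2.0**rate - 1.0) * 2.0 ** (-mu * exponent)
    ratio /= 1.0 - 2.0 ** (-(exponent - rate))
    return ratio ** (1.0 / s)


if __name__ == "__main__":
    points, es_n0 = psk(8), 10.0 ** 1.5  # 8-PSK at Es/N0 = 15 dB (hypothetical)
    for s in (1.0, 0.75, 0.5):
        e_s = expurgated_exponent(points, es_n0, s)
        print(s, e_s, theorem1_bound(2, 4, e_s, s))
```

Setting $s = 1$ recovers the random coding bound (12), since $E(1)$ coincides with $R_0(1/A)$ under these assumptions.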


V. Bounds involving the Free Euclidean Distance 


In this section we restrict ourselves to AWGN-channels, the most widely used channel model. 
All results, however, can be extended to general memoryless channels. The following theorem 
gives a strict lower bound on the average first error event probability $P_e$, i.e., $P_e(j)$ averaged
over all time units $j$, for trellis codes used on an AWGN-channel.

Theorem 2: The average event error probability $P_e$ of a trellis code on an AWGN-channel
with one sided noise power spectral density $N_0$ is lower bounded by

$$P_e \ge \frac{1}{l_{max}} \sum_{i=1}^{n_d} p_i\, Q\left( d_i \sqrt{\frac{E_s}{2 N_0}} \right),$$

where $d_i$ is the minimum normalized Euclidean distance $d = \min_{x'} \frac{|x - x'|}{\sqrt{E_s}}$ achievable
between a particular correct sequence $x$ and any incorrect sequence $x'$, $E_s = \sum_i p(a_i)|a_i|^2$
is the average signal energy, $l_{max} = \max_i l_i$, where $l_i$ is the minimum length (in branches) of
the error events that achieve the minimum distance $d_i$, and $p_i$ is the probability that the
minimum distance sequence pair $x, x'$ has distance $d_i$. $n_d$ is the number of different minimum
distances $d_i$ achievable in a particular code, where $d_1 < d_2 < \cdots < d_{n_d}$, and $d_1 = d_{free}$ is the
minimum free Euclidean distance of the code.


Remark: Since trellis codes, in general, are non-linear [1-3], the minimum Euclidean distance
$d_i$ among all error paths $V'$ with respect to a particular correct path $V$ depends on $V$. Then
$p_i$ is the fraction of correct paths whose nearest error path is at distance $d_i$.
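As a small illustration of how the lower bound of Theorem 2 is assembled, the sketch below takes a distance profile (pairs of $p_i$ and $d_i$) and $l_{max}$ as inputs, assuming the $Q$-function argument $d_i\sqrt{E_s/(2N_0)}$ for normalized distances. The profile values and function names are hypothetical and do not describe any particular code from the paper.

```python
import math


def q_function(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def event_error_lower_bound(distance_profile, l_max, es_n0_db):
    """Evaluate (1 / l_max) * sum_i p_i * Q(d_i * sqrt(Es / (2 N0))).

    distance_profile is a list of (p_i, d_i) pairs, where p_i is the fraction
    of correct paths whose nearest error path is at normalized distance d_i.
    """
    if not math.isclose(sum(p for p, _ in distance_profile), 1.0):
        raise ValueError("the fractions p_i must sum to one")
    es_n0 = 10.0 ** (es_n0_db / 10.0)
    total = sum(p * q_function(d * math.sqrt(es_n0 / 2.0)) for p, d in distance_profile)
    return total / l_max


if __name__ == "__main__":
    # Hypothetical profile: half of the correct paths see d_1, half see d_2.
    profile = [(0.5, 1.5), (0.5, 1.8)]
    for snr_db in (8.0, 10.0, 12.0):
        print(snr_db, event_error_lower_bound(profile, l_max=3, es_n0_db=snr_db))
```

For a code in which every correct path has the same nearest distance, the profile collapses to a single term with $p_1 = 1$ and $d_1 = d_{free}$.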
Proof: Assume that we want to bound the event error probability at node $j$ (Figure 4). Let
$\mathcal{V}'(j)$ be the set of paths diverging from the correct path at node $j$. Let $x'$ be the signal
sequence on a path $V_l' \in \mathcal{V}_l'(j)$ of length $l$. Assume that the correct path is $V$. Denote the
probability that the decoder follows $V$ for at least $l$ time units by $P_c$. Further, denote the
set of error paths diverging from $V$ at node $j + r$ by $\mathcal{V}'(j + r)$, and let $P_\varepsilon$ be the probability
that the decoder chooses any path in the set $\mathcal{E} = \mathcal{V}'(j) \cup \mathcal{V}'(j + 1) \cup \cdots \cup \mathcal{V}'(j + l - 1)$, i.e.,
$P_\varepsilon$ is the probability that the decoder diverges from the correct path before node $j + l$. $P_\varepsilon$
is lower bounded by

$$P_\varepsilon \ge P(x \to x'). \qquad (15)$$

This follows from the fact that eliminating all signal sequences but $x'$ from $\mathcal{E}$ allows us to
expand the decision regions of both $x$ and $x'$, thus increasing $P_c$ and decreasing $1 - P_c = P_\varepsilon$.

On the other hand, P e may be upper bounded tightly by 

Pe < Pe(j) + Pe(j + 1) + * * * + Pe(j + l ~ 1). (16) 

In order to proceed further, we combine (15) and (16) and average over all possible time 
units and correct paths V 4 , be., 

P(x “ x'j < PjJ) + PJJ + l) + ... + P e (j + l-l). 

Due to the linearity of the expectation operator, 

$$\overline{P_e(j) + P_e(j+1) + \cdots + P_e(j+l-1)} = \bar{P}_e(j) + \bar{P}_e(j+1) + \cdots + \bar{P}_e(j+l-1) = l\,\bar{P}_e(j),$$

since the average first event error probability $\bar{P}_e(j + r)$ is independent of time when averaged 
over all possible time units and correct paths. If we denote this average first event error 
probability by $\bar{P}_e$, we obtain from above 

$$\overline{P(x \to x')} \le l\,\bar{P}_e. \qquad (17)$$

Note that (17) holds for any incorrect path $v' \in \mathcal{V}'$ and that $l$ is the length of this path. 

We now also carry out the averaging on the left-hand side of (17), where in each case we 
choose the incorrect sequence $x'$ such that $|x - x'|$ is minimized, which yields the tightest 
possible lower bound. This sequence has length $l_i$, which possibly differs from $l$ in (17). This 
causes the dilemma that the chosen error sequences $x'$ may not all have equal lengths $l_i$, 
raising the question of which $l$ to use in (17). To guarantee that the bound in (17) is not 
violated, we let $l$ be the maximum length of the incorrect paths chosen, denoted by $l_{\max}$. 
For the AWGN channel with one-sided noise power spectral density $N_0$, the two-codeword 
error probability $P_2(x \to x')$ is given by 

$$P_2(x \to x') = Q\!\left(\frac{|x - x'|}{\sqrt{2N_0}}\right),$$

where $|x - x'| = \sqrt{\sum_j (x_j - x'_j)^2}$ is the Euclidean distance between the two signal sequences 
$x$ and $x'$. 

Footnote 4: Here the overbar denotes the averaging over the correct sequences for a particular code, not an average 
over a code ensemble as in the two preceding sections. For time-invariant codes, the average is reduced to 
an average over all correct paths. 

For some nodes and sequences $x$, the nearest neighbor is at distance 
$d_1 = \min_{x,x'} |x - x'| / \sqrt{E_s} = d_{free}$, for some it is at distance $d_2$, etc., up to some largest distance 
$d_{n_d}$. Further, let $l_i$ be the minimum length of the error event that achieves $d_i$. If we collect 
all the node error probabilities and weight them according to their probability of occurrence 
$p_i$, we obtain from (17) 


$$l_{\max}\,\bar{P}_e \ge \overline{P_2(x \to x')} = \sum_{i=1}^{n_d} p_i\, Q\!\left(d_i \sqrt{\frac{E_s}{2N_0}}\right), \qquad \sum_{i=1}^{n_d} p_i = 1, \qquad (18)$$

where $p_i$ denotes the probability that the nearest incorrect sequence $x'$ is at distance $d_i$, thus 
proving the theorem. Q.E.D. 
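
The bound of Theorem 2 is straightforward to evaluate once the distance profile of a code is known. The following is a minimal sketch, not part of the original analysis: the function names and the two-term distance profile in the usage lines are hypothetical, chosen only to illustrate how $\frac{1}{l_{\max}} \sum_i p_i Q(d_i \sqrt{E_s/(2N_0)})$ is computed.

```python
import math


def q_function(x: float) -> float:
    """Gaussian tail probability Q(x) = 0.5 * erfc(x / sqrt(2))."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def event_error_lower_bound(distances, probs, l_max, es_over_n0):
    """Evaluate (1 / l_max) * sum_i p_i * Q(d_i * sqrt(Es / (2 N0))).

    distances  : normalized minimum distances d_i
    probs      : probabilities p_i (must sum to 1)
    l_max      : maximum length (in branches) of the chosen error events
    es_over_n0 : Es/N0 as a linear ratio (not in dB)
    """
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("the p_i must sum to 1")
    scale = math.sqrt(es_over_n0 / 2.0)
    total = sum(p * q_function(d * scale) for d, p in zip(distances, probs))
    return total / l_max


# Hypothetical distance profile (illustration only, not taken from any code table).
es_n0 = 10.0 ** (10.0 / 10.0)  # 10 dB as a linear ratio
bound = event_error_lower_bound([2.0, 2.2], [0.75, 0.25], l_max=3, es_over_n0=es_n0)
print(bound)
```

Keeping only the $i = 1$ term, as is done in the text that follows, gives a slightly looser but simpler bound.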

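Note that Theorem 2 is valid for time-invariant as well as for time-varying trellis codes, 
while we had to assume time-varying codes in the derivation of Theorem 1. We now combine 
these two theorems. Using the well-known approximation of the Q-function [7, page 83], 

$$Q(x) \ge \frac{1}{\sqrt{2\pi}\,x}\left(1 - \frac{1}{x^2}\right) e^{-x^2/2},$$

in Theorem 2 and neglecting all terms $i > 1$, we obtain (see footnote 5) 

$$\bar{P}_e \ge \frac{p_1}{\sqrt{2\pi}\,l_1} \exp\left(-d_{free}^2\,\frac{E_s/N_0}{4} - \ln\left(d_{free}\sqrt{\frac{E_s}{2N_0}}\right) + \ln\left(1 - \frac{2}{d_{free}^2 E_s/N_0}\right)\right)$$

$$\phantom{\bar{P}_e} = \frac{p_1}{\sqrt{2\pi}\,l_1} \exp\left(-d_{free}^2\,\frac{E_s/N_0}{4}\,\bigl(1 + O(E_s/N_0)\bigr)\right), \qquad (19)$$

where $O(E_s/N_0)$ is a quantity that goes to 0 as $E_s/N_0 \to \infty$. 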
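Specializing Theorem 1 to AWGN channels, we obtain 

$$\bar{P}_e \le \left(\frac{1 - 2^{-R}}{2^{E(s) - R} - 1}\right)^{1/s} 2^{-\nu_m E(s)/s}, \qquad E(s) > R, \qquad (20)$$

where $0 < s \le 1$ and 

$$E(s) = -\log_2 \sum_{m=1}^{A} \sum_{p=1}^{A} p(a_m)\,p(a_p)\,\exp\left(-s\,|a_m - a_p|^2\,\frac{E_s}{4N_0}\right).$$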

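We thus have an upper bound (20) and a lower bound (19) on the first event error 
probability of trellis codes on AWGN channels, and therefore 

$$\frac{p_1}{\sqrt{2\pi}\,l_1} \exp\left(-d_{free}^2\,\frac{E_s/N_0}{4}\,\bigl(1 + O(E_s/N_0)\bigr)\right) \le \left(1 - 2^{-R}\right)^{1/s} \exp\left(-(\ln 2)\,\nu_m \frac{E(s)}{s} - \frac{1}{s} \ln\left(2^{E(s) - R} - 1\right)\right),$$

where $d_{free}$ is the normalized minimum free Euclidean distance of the best trellis code, since 
the upper bound is for the best trellis code (Theorem 1). We may take the natural logarithm 
of both sides to obtain 

$$-\ln\left(\sqrt{2\pi}\,l_1/p_1\right) - d_{free}^2\,\frac{E_s/N_0}{4}\,\bigl(1 + O(E_s/N_0)\bigr) \le -(\ln 2)\,\nu_m \frac{E(s)}{s} - \frac{1}{s} \ln\left(2^{E(s) - R} - 1\right) + \frac{1}{s} \ln\left(1 - 2^{-R}\right),$$

i.e., 

$$d_{free}^2 \bigl(1 + O(E_s/N_0)\bigr) \ge \frac{4 (\ln 2)\,\nu_m E(s)}{s E_s/N_0} + \frac{4}{s E_s/N_0} \left[\ln\left(2^{E(s) - R} - 1\right) - \ln\left(1 - 2^{-R}\right)\right] - \frac{4 \ln\left(\sqrt{2\pi}\,l_1/p_1\right)}{E_s/N_0}.$$

Footnote 5: This also allows us to set $l_{\max} = l_1$. 

For simplicity, let us denote $\frac{s E_s/N_0}{4}$ by $\alpha$. Then we obtain 

$$d_{free}^2 \bigl(1 + O(E_s/N_0)\bigr) \ge \frac{(\ln 2)\,\nu_m E(\alpha)}{\alpha} + \frac{\Theta(E(\alpha))}{\alpha} - \frac{4 \ln\left(\sqrt{2\pi}\,l_1/p_1\right)}{E_s/N_0}, \qquad (21)$$

where 

$$\Theta(E(\alpha)) = \ln\left(2^{E(\alpha) - R} - 1\right) - \ln\left(1 - 2^{-R}\right),$$

$$E(\alpha) = -\log_2 \sum_{m=1}^{A} \sum_{p=1}^{A} p(a_m)\,p(a_p)\,e^{-\alpha |a_m - a_p|^2}, \qquad 0 < \alpha \le \frac{E_s/N_0}{4}.$$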


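We can now obtain a lower bound on the minimum free Euclidean distance $d_{free}$ of the best 
code by letting $E_s/N_0 \to \infty$ in (21). This gives us the same bound derived in a different 
fashion by Rouanne and Costello [8], i.e., 

$$d_{free}^2 \ge \max_{\alpha > 0} \left[\frac{(\ln 2)\,\nu_m E(\alpha)}{\alpha} + \frac{\Theta(E(\alpha))}{\alpha}\right], \qquad E(\alpha) > R. \qquad (22)$$

On the other hand, (20) can be written as 

$$\bar{P}_e \le 2^{-\nu_m E_{ex}}, \qquad (23)$$

where from the definition of $\alpha$ the expurgated exponent $E_{ex}$ is given by 

$$E_{ex} \triangleq \max_{0 < \alpha \le \frac{E_s/N_0}{4}} \frac{E_s/N_0}{4} \left[\frac{E(\alpha)}{\alpha} + \frac{\Theta(E(\alpha))}{(\ln 2)\,\nu_m\,\alpha}\right], \qquad E(\alpha) > R. \qquad (24)$$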


If the maximizing value of $\alpha$ in (22), $\alpha_{\max}$, is smaller than $(E_s/N_0)/4$, maximizing the 
minimum free Euclidean distance is the same as maximizing the expurgated error exponent. 

For large $\nu_m$, the contribution of the term $\Theta(E(\alpha))$ in (22) and (24) becomes negligible, 
and we may form the asymptotic expurgated error exponent 

$$E_{ex}^{\infty} \triangleq \max_{0 < \alpha \le \frac{E_s/N_0}{4}} \frac{E(\alpha)}{\alpha}\,\frac{E_s/N_0}{4}, \qquad E(\alpha) \ge R, \qquad (25)$$

and the bound of (22) becomes 

$$\frac{d_{free}^2}{(\ln 2)\,\nu_m} \ge E_{ex}^{\infty}\,\frac{4}{E_s/N_0} = \frac{E(\alpha_{\max})}{\alpha_{\max}}, \qquad (26)$$

where $\alpha_{\max}$ is the value of $\alpha$ which maximizes (25) and $d_{free}^2/\nu_m$ is the asymptotic distance 
growth rate. If $\alpha_{\max} \le (E_s/N_0)/4$, then a signal constellation that maximizes the bound 
on the free distance will also maximize the expurgated error exponent. If, however, $\alpha_{\max} > 
(E_s/N_0)/4$, then the error bound (23) reduces to 

p ' s R ° > R • (27) 


where 

$$R_0 = -\log_2 \sum_{m=1}^{A} \sum_{p=1}^{A} p(a_m)\,p(a_p)\,e^{-|a_m - a_p|^2 E_s/(4N_0)} \qquad (28)$$

is the cutoff rate of the constellation on an AWGN channel, and no expurgated error bound 
exists. 

Maximizing $E_{ex}^{\infty}$ is the same as maximizing the function $E(\alpha)/\alpha$, which is accomplished 
easily with the help of the following lemma, which is proved in the appendix. 

Lemma 3: $E(\alpha)/\alpha$ is a monotonically decreasing function of $\alpha$. 


Since $E(\alpha)/\alpha$ is a monotonically decreasing function of $\alpha$, (25) achieves its supremum at 
the smallest value $\alpha$ such that $\alpha > 0$ and $E(\alpha) \ge R$. Since $E(\alpha)$, on the other hand, is a 
monotonically increasing function of $\alpha$, $\alpha_{\max}$ is the smallest value of $\alpha$ such that $E(\alpha) \ge R$ 
and is given by the implicit equation 

$$E(\alpha_{\max}) = R = -\log_2 \left(\sum_{m=1}^{A} \sum_{p=1}^{A} p(a_m)\,p(a_p)\,e^{-\alpha_{\max} |a_m - a_p|^2}\right). \qquad (29)$$
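
Because $E(\alpha)$ is monotonically increasing, the implicit equation (29) can be solved numerically by bisection. The sketch below is only an illustration under stated assumptions: the constellation is an equiprobable, unit-energy 8-PSK set, and the helper names (`psk`, `exponent`, `alpha_max`) are hypothetical, not notation from the text.

```python
import cmath
import math


def psk(m: int):
    """Unit-energy m-PSK constellation as complex points (assumed example)."""
    return [cmath.exp(2j * math.pi * i / m) for i in range(m)]


def exponent(alpha: float, points, probs=None) -> float:
    """E(alpha) = -log2 sum_m sum_p p(a_m) p(a_p) exp(-alpha |a_m - a_p|^2)."""
    n = len(points)
    probs = probs or [1.0 / n] * n
    total = sum(
        pm * pp * math.exp(-alpha * abs(am - ap) ** 2)
        for am, pm in zip(points, probs)
        for ap, pp in zip(points, probs)
    )
    return -math.log2(total)


def alpha_max(rate: float, points, probs=None, tol: float = 1e-10) -> float:
    """Smallest alpha with E(alpha) >= R, found by bisection on an increasing E."""
    lo, hi = 0.0, 1.0
    while exponent(hi, points, probs) < rate:  # bracket the root
        hi *= 2.0
        if hi > 1e6:
            raise ValueError("rate is not achievable with this constellation")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if exponent(mid, points, probs) < rate:
            lo = mid
        else:
            hi = mid
    return hi


points = psk(8)
a_max = alpha_max(2.0, points)  # R = 2 bits/signal
print(a_max, exponent(a_max, points) / a_max)
```

The second printed value is $E(\alpha_{\max})/\alpha_{\max}$, the quantity that appears on the right-hand side of (26). Evaluating `exponent` at $\alpha = (E_s/N_0)/4$ gives the cutoff rate of (28) for the same constellation.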


VI. A Geometric Construction 

We now show how $E_{ex}^{\infty}$ and $\alpha_{\max}$ can be constructed from a graph of the cutoff rate $R_0$. As 
an example consider the 8-PSK constellation whose cutoff rate $R_0$ in bits/signal is shown 
in Figure 5 (dotted line). When $E_s/N_0 \ge 4\alpha_{\max}$, (25) implies that $E_{ex}^{\infty}$ is a linear function 
of $E_s/N_0$ and, as can be seen from (29), its slope $E(\alpha_{\max})/(4\alpha_{\max})$ depends only on the rate 
$R$ for a fixed constellation. As $E_s/N_0 \to 4\alpha_{\max}$ from above, $E_{ex}^{\infty} \to R_0$. The higher the 
available energy, i.e., the larger $E_s/N_0$ is for a particular $R$, the larger $E_{ex}^{\infty}$ will be. In Figure 
5, $E_{ex}^{\infty}$ is shown as solid lines over the range where (23) exists for several values of $R$. For a 
code with a larger value of $R$, the expurgated exponent grows more slowly with $E_s/N_0$ and 
a larger $E_s/N_0$ is required for the expurgated bound to exist. 

With these preliminaries, $E_{ex}^{\infty}$, as well as the asymptotic distance growth rate, can easily 
be constructed from a graph of the cutoff rate $R_0$. This construction is also illustrated in 
Figure 5. 

Construction of the asymptotic expurgated error exponent $E_{ex}^{\infty}$ from the cutoff rate $R_0$: 

1. Choose the value of the code rate $R$. The cutoff-point is the intersection of a line 
a distance $R$ above and parallel to the $E_s/N_0$-axis with the cutoff-rate curve. The 
x-value of the cutoff-point is $4\alpha_{\max}$. 

2. Draw a straight line $g$ through the origin of the graph and the cutoff-point. 

3. The expurgated exponent for any $E_s/N_0 \ge 4\alpha_{\max}$ is the y-value of $g$ at that value of 
$E_s/N_0$. 

The asymptotic bound on $d_{free}^2$ from (26) is $4(\ln 2)\nu_m$ times the slope of $g$. 
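
The three construction steps can be mirrored numerically. The following is a sketch under assumptions, not the authors' procedure: it uses an equiprobable, unit-energy 8-PSK constellation, takes $E_s/N_0$ as a linear ratio (so that $g$ is a straight line through the origin), and the function names and the value $\nu_m = 4$ are hypothetical.

```python
import cmath
import math


def cutoff_rate(es_n0: float, points) -> float:
    """R_0 in bits/signal for equiprobable unit-energy points; es_n0 is linear."""
    n = len(points)
    total = sum(
        math.exp(-abs(a - b) ** 2 * es_n0 / 4.0) for a in points for b in points
    ) / n ** 2
    return -math.log2(total)


def cutoff_point(rate: float, points, tol: float = 1e-10) -> float:
    """Step 1: x-value (= 4 * alpha_max) where the R_0 curve reaches height R."""
    lo, hi = 0.0, 1.0
    while cutoff_rate(hi, points) < rate:  # R_0 is increasing in Es/N0
        hi *= 2.0
        if hi > 1e6:
            raise ValueError("rate is not achievable with this constellation")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cutoff_rate(mid, points) < rate:
            lo = mid
        else:
            hi = mid
    return hi


def construct(rate: float, points, es_n0: float, nu_m: float):
    """Steps 2 and 3, plus the asymptotic distance bound from the slope of g."""
    x_cut = cutoff_point(rate, points)
    slope = rate / x_cut  # Step 2: line g through the origin and the cutoff-point
    if es_n0 < x_cut:
        raise ValueError("expurgated bound does not exist below the cutoff-point")
    e_ex = slope * es_n0  # Step 3: y-value of g at the requested Es/N0
    d2_bound = 4.0 * math.log(2.0) * nu_m * slope
    return x_cut, e_ex, d2_bound


points = [cmath.exp(2j * math.pi * i / 8) for i in range(8)]
x_cut = cutoff_point(2.0, points)
_, e_ex, d2_bound = construct(2.0, points, es_n0=2.0 * x_cut, nu_m=4)
print(x_cut, e_ex, d2_bound)
```

At twice the cutoff-point the exponent is $2R$, which is simply the statement that $g$ is linear through the origin.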

We should note the importance of $R_0$ at this point. If a constellation $C_1$ has a higher 
value of $R_0$ than constellation $C_2$ for some range of the signal-to-noise ratio $E_s/N_0$, then it 
is evident from the above construction that trellis codes using constellation $C_1$, at a rate 
$R$ such that $4\alpha_{\max}$ (the x-value of its cutoff-point) falls into that range, will have a larger 
expurgated error exponent $E_{ex}$ as well as a larger asymptotic bound on the achievable free 
Euclidean distance $d_{free}$ than trellis codes using constellation $C_2$. The merit of a constellation 
in conjunction with trellis codes can therefore be judged on the basis of its cutoff rate $R_0$, and 
it is not necessary to evaluate either the expurgated bound or the bound on the minimum 
free Euclidean distance. 

A constellation can now be optimized for Euclidean distance as well as event error prob- 
ability by optimizing its cutoff-rate. Consider the upper envelope of the cutoff-rate curves 
for a set of possible signal constellations. Then using the above construction, the desired 
code rate R determines the constellation with the best cutoff-rate. This constellation then 
optimizes the Euclidean distance and the event error probability for this code rate R. 

As an example of constellation optimization we have numerically optimized a pulse amplitude 
modulation (PAM) constellation with 8 signal points in Figure 6. It is interesting 
to see that for very small signal-to-noise ratios, $E_s/N_0 < 1$ dB, the resulting constellation 
is in fact only 2-valued (BPSK). For larger $E_s/N_0$, successively more signal points move 
away from the clusters to form higher-sized constellations. At values of $E_s/N_0 > 13$ dB, the 
constellation with uniform spacing (8-PAM) becomes optimal. 

This optimization gives a cutoff-rate gain of up to a factor 2 (3 dB) in $E_s/N_0$, as shown 
in Figure 7. This may be important for the construction of trellis codes for very low $E_s/N_0$ 
applications. It also confirms the well-accepted observation that small-sized constellations 
are preferable for small values of $E_s/N_0$. The optimization of a PAM constellation with 4 
signal points gives similar behavior, with much smaller gains in $E_s/N_0$, however. 

It is not hard to show that the regular, unit-energy constrained, 2-dimensional constellation 
with 4 signal points (QPSK) is optimal in the above sense for all values of $E_s/N_0$. 
We have further observed numerically that the corresponding optimal circular constellation 
with 8 signal points is also regularly spaced (uniform 8-PSK). 

VII. Conclusions 

We have presented an expurgated bound on the first event error probability of trellis coded 
modulation on AWGN-channels. The asymptotic form of this bound is equivalent to known 
bounds on the minimum free Euclidean distance. The expurgated form of the bound gives, 
however, more information since it does not require an infinite signal-to-noise ratio to eval- 
uate. The expurgated bound is a linear function of the signal-to-noise ratio and a simple 
construction, based on R 0 , has been presented. The bound can also be used as a means of 
comparing different signal constellations. 
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IX. Appendix: Proof of Lemma 3 

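We must show that $E(\alpha)/\alpha$ is a monotonically decreasing function of $\alpha$. Let $F(\alpha) = E(\alpha)/\alpha$. Then, for $\epsilon > 0$, 

$$F(\alpha + \epsilon) = -\frac{1}{\alpha + \epsilon} \log_2 \left(\sum_{m=1}^{A} \sum_{p=1}^{A} p(a_m)\,p(a_p) \exp\left(-(\alpha + \epsilon) |a_m - a_p|^2\right)\right)$$

$$= -\frac{1}{\alpha}\,\frac{\alpha}{\alpha + \epsilon} \log_2 \left(\sum_{m=1}^{A} \sum_{p=1}^{A} p(a_m)\,p(a_p) \exp\left(-(\alpha + \epsilon) |a_m - a_p|^2\right)\right)$$

$$= -\frac{1}{\alpha} \log_2 \left(\sum_{m=1}^{A} \sum_{p=1}^{A} p(a_m)\,p(a_p) \exp\left(-(\alpha + \epsilon) |a_m - a_p|^2\right)\right)^{\frac{\alpha}{\alpha + \epsilon}}.$$

We now use Jensen's inequality (see, e.g., [12, appendix B]) for the special case $\overline{X^{\theta}} \le \bar{X}^{\theta}$ 
with $0 < \theta \le 1$, where the overbar denotes expectation, and obtain 

$$F(\alpha + \epsilon) \le -\frac{1}{\alpha} \log_2 \left(\sum_{m=1}^{A} \sum_{p=1}^{A} p(a_m)\,p(a_p) \exp\left(-\frac{\alpha (\alpha + \epsilon)}{\alpha + \epsilon} |a_m - a_p|^2\right)\right)$$

$$= -\frac{1}{\alpha} \log_2 \left(\sum_{m=1}^{A} \sum_{p=1}^{A} p(a_m)\,p(a_p) \exp\left(-\alpha |a_m - a_p|^2\right)\right)$$

$$= F(\alpha).$$

Thus $F(\alpha)$ is monotonically decreasing, which proves the lemma. Q.E.D. 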
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Figure Captions 


Figure 1: Trellis coded modulation using quadrature modulation. 

Figure 2: A correct and incorrect path pair through a trellis. 

Figure 3: Cutoff- rates for different signal constellations for the AWGN-channel. 
Figure 4: Some error paths diverging from the correct path at node j . 

Figure 5: The cutoff-rate Ro and the expurgated error exponent of 8-PSK. 
Figure 6: Optimized 4-PAM and 8-PAM constellations for several rates R. 
Figure 7: Optimized QPSK and 8-PSK constellations for several rates R. 
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Figure 1: Trellis coded modulation using quadrature modulation. 










Figure 4: Some error paths diverging from the correct path at node j. 
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Figure 7: Cutoff-rates of the uniform 8-PAM and the optimized 8-PAM constellation. 
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Introduction 


In concatenated coding systems using Reed-Solomon (RS) outer 
codes over GF($2^b$) and ideal symbol interleaving between the 
inner and outer code, the system bit error rate (BER) is closely 
approximated by 

$$P_b \approx \frac{2t + 1}{N} \sum_{i=t+1}^{N} \binom{N}{i} P_s^i (1 - P_s)^{N-i}, \qquad (1)$$

where $P_s$ is the symbol error rate (SER) out of the inner decoder. 

Thus, for a given outer code the performance measure of the 
inner code is the SER and not the BER. 


For convolutional and trellis inner codes, simulation is gener- 
ally used to obtain P s . The byte size of the outer code symbol 
requires a very large number of bits to be simulated for statis- 
tically valid points. 

Goal : Find an analytic method for determining P s for convolu- 
tional and trellis codes. (Onyszchuk and McEliece for 1/n convolu- 
tional codes) 


Conceptual Motivation 

Reinterpret the union bound on the probability of bit error for 
a convolutional code given by 

$$P_b \le \frac{1}{k} \sum_{d=d_{free}}^{\infty} B_d P_d, \qquad (2)$$

where $B_d$ is the total number of nonzero information bits on all 
weight $d$ paths and $P_d$ is the two-codeword error probability. 

Traditionally, this bound is developed by considering a trun- 
cated trellis as a block code and then computing the average 
number of information bit errors per decoded information block. 
Another derivation of this bound is useful. 

Example: r = 1/2, m = 2, convolutional code with the following 
feedforward encoder realization and trellis. 




o 


An error event of weight $d$ occurs with probability $P_d$. By simple 
counting, this error event can cause the information bit on the 
$j$th branch to be in error in precisely $B_d = 2$ ways, where $B_d$ is 
the number of nonzero information bits on the incorrect path. 

• Thus, the BER due to this error event, denoted $P_{b,d}$, is 

$$P_{b,d} = B_d P_d. \qquad (3)$$

• Summing over all possible error events yields 

$$P_b \le \sum_{d=d_{free}}^{\infty} B_d P_d. \qquad (4)$$
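
The sum over error events can be evaluated numerically once the bit weights $B_d$ are known. The sketch below is an illustration under explicit assumptions: the slide's encoder figure is not recoverable, so the code assumes the well-known rate 1/2, $m = 2$ code with octal generators (7,5), for which $d_{free} = 5$ and $B_d = (d-4)\,2^{d-5}$, and it assumes BPSK signalling with $P_d = Q(\sqrt{2 d R E_b/N_0})$. The function names and the truncation point `d_max` are hypothetical.

```python
import math


def q_function(x: float) -> float:
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def bit_weight(d: int) -> int:
    """B_d for the assumed (7,5) code: (d - 4) * 2**(d - 5) for d >= 5."""
    return (d - 4) * 2 ** (d - 5) if d >= 5 else 0


def union_bound_ber(eb_n0_db: float, rate: float = 0.5, k: int = 1,
                    d_free: int = 5, d_max: int = 60) -> float:
    """Truncated union bound (1/k) * sum_d B_d * P_d, assuming BPSK on AWGN."""
    eb_n0 = 10.0 ** (eb_n0_db / 10.0)
    total = 0.0
    for d in range(d_free, d_max + 1):
        p_d = q_function(math.sqrt(2.0 * d * rate * eb_n0))
        total += bit_weight(d) * p_d
    return total / k


for snr_db in (4.0, 5.0, 6.0):
    print(snr_db, union_bound_ber(snr_db))
```

The truncation at `d_max` is harmless at moderate to high $E_b/N_0$, where the terms decay geometrically; at low $E_b/N_0$ the union bound itself diverges.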

The counting technique used to determine the upper bound on 
Pb can be extended to bound the SER out of the inner decoder. 


• Example: $r = 2/3$, $m = 1$ convolutional code with a feedforward 
encoder and the following trellis and with 4-bit symbols for the 
outer RS code. 

• Assuming that symbol boundaries are always aligned with trellis 
nodes, the error event shown can cause the particular 4-bit 
symbol to be in error in 3 ways, each occurring with probability 
$P_d$. 

• Thus, the probability of symbol error due to this error event is 

$$P_{s,d,l} = 3 P_d. \qquad (5)$$

• An error event of length $l$ branches and weight $d$ can cause 
an error in at most $(b/k + l - m - 1)$ ways, each occurring with 
probability $P_d$. 

[Figure: an $l$-branch error event ending in $m$ zero branches, shown against a $b$-bit symbol spanning $b/k$ branches.] 

The $-m$ term is due to the fact that in feedforward realizations, 
all error events end with $m$ consecutive 0 branches. 

Summing over all error events of all lengths gives 

$$P_s \le \sum_{d=d_{free}}^{\infty} \sum_{l=m+1}^{\infty} (b/k + l - m - 1)\,A_{d,l}\,P_d, \qquad (6)$$

where $A_{d,l}$ is the number of weight $d$ paths with length $l$. This can 
be simplified to 

$$P_s \le (b/k - m - 1) \sum_{d=d_{free}}^{\infty} A_d P_d + \sum_{d=d_{free}}^{\infty} L_d P_d, \qquad (7)$$

where $A_d = \sum_l A_{d,l}$ is the number of weight $d$ paths and 

$$L_d = \sum_{l=m+1}^{\infty} l\,A_{d,l} \qquad (8)$$

is the total length in branches of all weight $d$ paths. 
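
The equivalence of (6) and (7) is easy to check numerically for a concrete path spectrum. The following sketch rests on assumptions that are not in the slides: it uses the rate 1/2, $m = 2$ code with octal generators (7,5), whose transfer function $T(D,L) = D^5 L^3 / (1 - DL(1+L))$ gives $A_{5+j,\,3+j+i} = \binom{j}{i}$, an 8-bit RS symbol, and a BPSK pairwise error probability. All function names are hypothetical.

```python
import math
from collections import defaultdict


def q_function(x: float) -> float:
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def spectrum_75(d_max: int) -> dict:
    """A_{d,l} of the assumed (7,5) code, truncated at weight d_max."""
    spec = {}
    for j in range(d_max - 5 + 1):
        for i in range(j + 1):
            spec[(5 + j, 3 + j + i)] = math.comb(j, i)
    return spec


def ser_bound_direct(spectrum, b, k, m, pairwise) -> float:
    """Double sum (6): sum over (d, l) of (b/k + l - m - 1) * A_{d,l} * P_d."""
    return sum((b / k + l - m - 1) * count * pairwise(d)
               for (d, l), count in spectrum.items())


def ser_bound_simplified(spectrum, b, k, m, pairwise) -> float:
    """Simplified form (7): (b/k - m - 1) * sum A_d P_d + sum L_d P_d."""
    a_d = defaultdict(int)  # number of weight-d paths
    l_d = defaultdict(int)  # total length of all weight-d paths, as in (8)
    for (d, l), count in spectrum.items():
        a_d[d] += count
        l_d[d] += l * count
    return sum(((b / k - m - 1) * a_d[d] + l_d[d]) * pairwise(d) for d in a_d)


eb_n0 = 10.0 ** (5.0 / 10.0)  # 5 dB, linear


def pairwise(d: int) -> float:
    """Assumed BPSK pairwise error probability for a rate 1/2 code."""
    return q_function(math.sqrt(2.0 * d * 0.5 * eb_n0))


spec = spectrum_75(d_max=30)
direct = ser_bound_direct(spec, b=8, k=1, m=2, pairwise=pairwise)
simplified = ser_bound_simplified(spec, b=8, k=1, m=2, pairwise=pairwise)
print(direct, simplified)
```

Grouping by weight first, as in (7), is what makes the transfer function form possible: $\sum_d A_d P_d$ and $\sum_d L_d P_d$ correspond to $T$ and $\partial T / \partial L$ evaluated at $L = 1$.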
Performance Factors: 

1. Path multiplicity is dominant. 

2. Degree of byte orientation, $b/k$. (Lin-nan Lee) 

3. Length of the error events. (Simon and Divsalar) 

• In terms of the code transfer function 

$$P_s \le K(d_f) \left.\left\{(b/k - m - 1)\,T(D, L, I) + \frac{\partial T(D, L, I)}{\partial L}\right\}\right|_{L=I=1,\; D=\exp(-E_s/N_0)} \qquad (9)$$

• In systematic feedback encoder realizations, error events cannot 
end in all-zero branches. Thus, the bound becomes 

$$P_s \le K(d_f) \left.\left\{(b/k - 1)\,T(D, L, I) + \frac{\partial T(D, L, I)}{\partial L}\right\}\right|_{L=I=1,\; D=\exp(-E_s/N_0)} \qquad (10)$$


Trellis Codes 


• For appropriate trellis codes, a bound on the SER can be 
obtained using the Zehavi and Wolf transfer function. 

• For the L×MPSK codes constructed by Pietrobon et al. and 
Ungerboeck, systematic feedback encoders are used and the 
bound becomes 

$$P_s \le K(d_f) \left.\left\{(b/k - 1)\,T(W, L, I) + \frac{\partial T(W, L, I)}{\partial L}\right\}\right|_{L=I=1,\; W=\exp(-E_s/4N_0)} \qquad (11)$$



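[Figure: Symbol error rate (SER), simulation vs. bound, for the Ungerboeck $\nu = 2$ 8PSK code. Curves: bound, simulation. Horizontal axis: $E_b/N_0$ (dB).] 

[Figure: Symbol error rate (SER), simulation results for multi-dimensional 8PSK trellis codes. Curves: 8PSK, $\nu = 2$; 2×8PSK, $\nu = 2$, $q = 1$; 4×8PSK, $\nu = 2$, $q = 3$. Horizontal axis: $E_b/N_0$ (dB).] 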

Conclusions 


• An upper bound on the SER for convolutional/trellis codes can 
be obtained using a transfer function approach. 

• Feedforward realizations of a particular code may perform bet- 
ter than the feedback realization of the same code in concate- 
nated systems. 

• For concatenated systems, it may be better to design the inner 
code to have a short $d_{free}$ path, i.e., to design the $d_{free}$ path to 
be a parallel transition. 

• Multi-D trellis codes with byte oriented branches do not im- 
prove in SER compared to 2D Ungerboeck codes because of 
high path multiplicities and dense spectra. 
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Trellis Codes With Linear Parity Check Equations. 


• For rate $k/(k+1)$ trellis codes, the parity check equation 
defines the relationship between the $k + 1$ binary output 
sequences $y^0(D), y^1(D), \ldots, y^k(D)$. 

• A linear parity check equation: 

$$H^k(D)\,y^k(D) \oplus \cdots \oplus H^1(D)\,y^1(D) \oplus H^0(D)\,y^0(D) = 0(D)$$

where $H^i(D)$ = parity check polynomial of $y^i(D)$, 

$0(D)$ = all zeros sequence. 

• The constraint length ($\nu$) of an encoder is the maximum 
degree of all $H^i(D)$, i.e., 

$$\nu = \max_{\text{all } i} \deg H^i(D)$$

• The memory ($m$) of an encoder is the number of delay 
elements required to implement an encoder. 

• For linear codes it can be shown that $m = \nu$. 

(See Forney, "Convolutional Codes I", IEEE Trans. on 
Inform. Theory, November 1970). 



• The integer representation of $y^i(D)$, for $0 \le i \le k$, is defined as 

$$y(D) = y^0(D) + 2 y^1(D) + \cdots + 2^k y^k(D) = \sum_{i=0}^{k} 2^i y^i(D)$$

• We define a naturally mapped signal set as a signal set 
such that a discrete phase rotation of the signal set 
produces a rotated sequence $y_r(D)$ 

$$y_r(D) = y(D) + 1(D) \pmod{M}$$

where $1(D)$ = all ones sequence 

$M = 2^{k+1}$ 


Example: MPSK 

[Figure: naturally mapped 4PSK signal set with labels $y^1 y^0$ and naturally mapped 8PSK signal set with labels $y^2 y^1 y^0$.] 
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Special Case: Naturally Mapped 16 QAM 


[Figure: naturally mapped 16 QAM signal set with 4-bit labels $y^3 y^2 y^1 y^0$.] 

Here 

$$y(D) = y^0(D) + 2 y^1(D)$$

with 

$$y_r(D) = y(D) + 1(D) \pmod{4}, \qquad y_r^2(D) = y^2(D) \text{ and } y_r^3(D) = y^3(D).$$
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Systematic Encoding 


For a systematic encoder we let 

$$y^1(D) = x^1(D), \quad y^2(D) = x^2(D), \quad \ldots, \quad y^k(D) = x^k(D)$$

Example of Systematic Encoder with $\nu = 3$ and $k = 2$ (rate 2/3). 

[Figure: systematic encoder with inputs $x^2(D)$ and $x^1(D)$.] 
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Effect of Phase Rotation on Linear Parity Check 

Equations 


• With natural mapping we have 

$$y_r^0 = y^0 \oplus 1 = \bar{y}^0$$

$$y_r^1 = y^1 \oplus y^0$$

$$y_r^2 = y^2 \oplus y^0 \cdot y^1$$

$$\vdots$$

$$y_r^k = y^k \oplus \prod_{j=0}^{k-1} y^j$$

• On a phase rotation the parity check equation becomes 

$$H^k(D)\,y_r^k(D) \oplus H^{k-1}(D)\,y_r^{k-1}(D) \oplus \cdots \oplus H^1(D)\,y_r^1(D) \oplus H^0(D)\,y_r^0(D) = 0(D)$$

$$H^0(D)\,y_r^0(D) = H^0(D)\,(y^0(D) \oplus 1(D))$$

$$= H^0(D)\,y^0(D) \oplus H^0(D)\,1(D)$$

$$= H^0(D)\,y^0(D) \oplus E[H^0(D)](D)$$

where $E[H^0(D)]$ is the modulo-2 number of non-zero terms in 
$H^0(D)$, e.g., $E[D^5 \oplus D^4 \oplus D^3 \oplus D^2] = 0$. 

If $E[H^0(D)] = 0$, then $H^0(D)\,y_r^0(D) = H^0(D)\,y^0(D)$. 

$$H^1(D)\,y_r^1(D) = H^1(D)\,(y^1(D) \oplus y^0(D))$$

$$= H^1(D)\,y^1(D) \oplus H^1(D)\,y^0(D)$$

$$\ne H^1(D)\,y^1(D)$$

Thus linear parity check equations are not phase transparent. 



A Parity Check Equation Not Affected by a Phase 

Rotation 


• Assume that $E[H^0(D)] = 0$. Let 

$$z(D) = (D^a + (M-1) D^b)\,y(D) \pmod{M}$$

where $\nu > a > b > 0$. 

• On a phase rotation 

$$z_r(D) = (D^a + (M-1) D^b)\,y_r(D) \pmod{M}$$

$$= (D^a + (M-1) D^b)\,(y(D) + 1(D)) \pmod{M}$$

$$= (D^a + (M-1) D^b)\,y(D) + (D^a + (M-1) D^b)\,1(D) \pmod{M}$$

• Note that $D^i\,1(D) = 1(D)$ for all integers $i$. Thus 

$$z_r(D) = z(D) + 1(D) + (M-1)(D) \pmod{M}$$

$$= z(D) + M(D) \pmod{M}$$

$$= z(D)$$

• Thus all the bits in $z(D)$ are unaffected by a phase rotation. 
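
The invariance argument above can be checked numerically on a finite sequence. The sketch below is only an illustration: the function name `z_sequence`, the random test sequence, and the parameter choice $M = 8$, $a = 2$, $b = 1$ are assumptions made for the example.

```python
import random


def z_sequence(y, a, b, modulus, coeff):
    """z_n = (y_{n-a} + coeff * y_{n-b}) mod M, for n >= a (with a > b >= 0)."""
    return [(y[n - a] + coeff * y[n - b]) % modulus for n in range(a, len(y))]


M, a, b = 8, 2, 1
rng = random.Random(0)
y = [rng.randrange(M) for _ in range(50)]   # integer-labelled signal sequence
y_rot = [(v + 1) % M for v in y]            # one discrete phase rotation

z = z_sequence(y, a, b, M, M - 1)
z_rot = z_sequence(y_rot, a, b, M, M - 1)
print(z == z_rot)  # prints True: the rotation adds 1 + (M - 1) = M, i.e. 0 mod M
```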





• Note that the most significant bit of $z(D)$ is a function of 
all $y^i(D)$, satisfying the requirement that these bits are 
checked by the encoder. 

• We have that 

$$z(D) = z^0(D) + 2 z^1(D) + \cdots + 2^k z^k(D)$$

and 

$$H^0(D) = D^{\nu} \oplus h_0^{\nu-1} D^{\nu-1} \oplus \cdots \oplus h_0^2 D^2 \oplus h_0^1 D \oplus 1.$$

• We form the parity check equation 

$$z^k(D) \oplus h^{k-1} z^{k-1}(D) \oplus \cdots \oplus h^1 z^1(D) \oplus H^0(D)\,y^0(D) = 0(D)$$

$z^k(D)$ is always selected, since it checks all input bits 
(thus avoiding parallel transitions). 

$h^i$ are used to select other bits of $z(D)$. 

$z^0(D)$ is not selected since it is a linear function 
of $y^0(D)$ (which is taken care of by $H^0(D)\,y^0(D)$ 
in the parity check equation). 

• In implementing an encoder, we need to determine $z^i(D)$ in 
terms of $y^0(D), y^1(D), \ldots, y^k(D)$. 

• If $E[H^0(D)] = 1$, we let 

$$z(D) = (D^a + (M/2 - 1) D^b)\,y(D) \pmod{M}$$

• With this form of $z(D)$ we have 

$$z_r(D) = (D^a + (M/2 - 1) D^b)\,(y(D) + 1(D)) \pmod{M}$$

$$= (D^a + (M/2 - 1) D^b)\,y(D) + 1(D) + (M/2 - 1)(D) \pmod{M}$$

$$= z(D) + (M/2)(D) \pmod{M}$$

• Thus we have $z_r^i(D) = z^i(D)$ for $0 \le i \le k - 1$ (i.e., the first $k$ 
least significant bits of $z(D)$ are unaffected by a phase 
rotation) but 

$$z_r^k(D) = z^k(D) \oplus 1(D).$$

• Since $z^k(D)$ is always selected, the $1(D)$ term generated by 
$z^k(D)$ will cancel the $1(D)$ term generated by $H^0(D)\,y^0(D)$. 

• We can also have other forms of $z(D)$, as long as $z^k(D)$ checks 
all the bits in $y(D)$. For example (with $E[H^0(D)] = 0$), 

$$z(D) = (D^a + 3 D^b + 4 D^c)\,y(D) \pmod{8}.$$
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- Example: Rate 1/2 QPSK (M = 4) 

• The input sequence 

$$x(D) = x^1(D)$$

and the output sequence 

$$y(D) = y^0(D) + 2 y^1(D).$$

• We have 

$$z(D) = (D^a + 3 D^b)\,y(D) \pmod{4}.$$

• We need to express $z(D)$ in terms of $y^0(D)$ and $y^1(D)$: 

$$z(D) = (D^a + 3 D^b)\,y(D) \pmod{4}$$

$$z(D) = (D^a + D^b + 2 D^b)\,(y^0(D) + 2 y^1(D)) \pmod{4}$$

$$z^0(D) + 2 z^1(D) = (D^a + D^b)\,y^0(D) + 2 \left((D^a + D^b)\,y^1(D) + D^b y^0(D)\right) \pmod{4}$$

• For a two bit binary adder 

$$s = e \oplus f \oplus c_i$$

$$c_o = e \cdot f \oplus c_i \cdot (e \oplus f)$$

• Thus 

$$z^0(D) = (D^a \oplus D^b)\,y^0(D) \quad \text{(not used)}$$

$$z^1(D) = (D^a \oplus D^b)\,y^1(D) \oplus D^b y^0(D) \oplus D^a y^0(D) \cdot D^b y^0(D)$$

$$= (D^a \oplus D^b)\,y^1(D) \oplus D^a \bar{y}^0(D) \cdot D^b y^0(D)$$

• The parity check equation becomes 

$$(D^a \oplus D^b)\,y^1(D) \oplus D^a \bar{y}^0(D) \cdot D^b y^0(D) \oplus H^0(D)\,y^0(D) = 0(D)$$
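
The binary expressions for $z^0$ and $z^1$ can be checked exhaustively against the integer definition $z = (D^a + 3D^b)\,y \pmod 4$. The sketch below is a verification aid only; the variable names (`y0a` for $D^a y^0$, `y1b` for $D^b y^1$, and so on) are hypothetical shorthand for the delayed bits at one time instant.

```python
from itertools import product


def z_integer(ya: int, yb: int) -> int:
    """Integer form: z = (D^a y + 3 D^b y) mod 4 at a single time instant."""
    return (ya + 3 * yb) % 4


def z_bits(y0a: int, y1a: int, y0b: int, y1b: int):
    """Binary form: z0 by XOR, z1 with the complemented-bit product term."""
    z0 = y0a ^ y0b
    z1 = (y1a ^ y1b) ^ ((1 - y0a) & y0b)
    return z0, z1


matches = []
for y0a, y1a, y0b, y1b in product((0, 1), repeat=4):
    z0, z1 = z_bits(y0a, y1a, y0b, y1b)
    matches.append(z_integer(y0a + 2 * y1a, y0b + 2 * y1b) == z0 + 2 * z1)

print(all(matches))  # prints True: the two forms agree on all 16 input combinations
```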
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Example of Rate 1/2 Systematic Encoder With 

Feedback 


• We have $\nu = 3$, $a = 2$, and $b = 1$, which gives the parity 
check equation 

$$(D^2 \oplus D)\,y^1(D) \oplus D^2 \bar{y}^0(D) \cdot D y^0(D) \oplus (D^3 \oplus h_0^2 D^2 \oplus h_0^1 D \oplus 1)\,y^0(D) = 0(D)$$



• Example of rate 1/2 encoder with $\nu = 4$, $a = 3$, and $b = 1$. 



• For $|a - b| > 2$ the encoder may not be minimal or $H^0(D)$ 
may need to be restricted. 
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Example: Rate 2/3 8PSK (M = 8) 


• We have 

$$z(D) = (D^a + 7 D^b)\,y(D) \pmod{8}$$

• Expressing $z(D)$ in terms of $y^0(D)$, $y^1(D)$, and $y^2(D)$ 

$$z(D) = (D^a + D^b + 2 D^b + 4 D^b)\,(y^0(D) + 2 y^1(D) + 4 y^2(D)) \pmod{8}$$

$$z(D) = (D^a + D^b)\,y^0(D) + 2 \left((D^a + D^b)\,y^1(D) + D^b y^0(D)\right) + 4 \left((D^a + D^b)\,y^2(D) + D^b y^1(D) + D^b y^0(D)\right) \pmod{8}$$

• Using two bit logic adders 

$$z^2(D) = (D^a \oplus D^b)\,y^2(D) \oplus D^b (y^1(D) \oplus y^0(D))$$

$$\oplus\; D^a y^1(D) \cdot D^b y^1(D) \oplus D^a y^0(D) \cdot D^b y^0(D) \cdot (D^a \oplus D^b)\,y^1(D)$$

$$\oplus\; D^b y^0(D) \cdot \left((D^a \oplus D^b)\,y^1(D) \oplus D^a y^0(D) \cdot D^b y^0(D)\right)$$

• Example of rate 2/3 encoder with $\nu = 3$, $a = 2$, and $b = 1$. 

• Parity check equation: 

$$(D^2 \oplus D)\,y^2(D) \oplus w^2(D) \oplus h^1 z^1(D) \oplus (D^3 \oplus h_0^2 D^2 \oplus h_0^1 D \oplus 1)\,y^0(D) = 0(D)$$

• Note that the encoder is not minimal. 
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Conclusions 


• Trellis codes based on linear parity check equations are 
not rotationally invariant. 

• A general parity check equation for rotationally invariant 
trellis codes has been presented. 

• A method of finding an encoder implementation for these 
codes has been given. 

• Not all rotationally invariant codes are minimal. Rate 
k/(k + 1) codes with two or more checked bits are not 
minimal. 

• Method can be applied to all signal sets with phase 
symmetry by appropriately mapping points in the signal 
set. 

• Since codes are non-linear, a systematic code search 
involves searching all paths to find the free distance. 
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June 1989 

Submitted to the 
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Abstract 

A method of finding good trellis codes with multi-dimensional (multi-D) QAM modu- 
lation is presented. Using the 16QAM signal set, 4-D, 6-D, and 8-D QAM signal sets are 
constructed which have good partition and phase rotational properties. 

The good partition properties are achieved by the use of block codes and their cosets 
restricting each level in the multi-D mapping. The rotational properties are achieved through 
the use of a naturally mapped 16 QAM signal set. This signal set has the property that, 
of the four bits used to map the signal set, only two bits are affected by a 90° phase rotation. 
With an appropriate addition of the coset generators, the multi-D signal sets also have two 
mapping bits affected by a 90° phase rotation (the remaining bits being unaffected). 

This implies that many good rate $k/(k+1)$ trellis codes can be found for effective rates 
between 3.0 and 3.75 bit/T and that are 90° or 180° transparent. The results from a systematic 
code search using these signal sets are presented. 


‘This work was supported by NASA Grant NAG5-557. 


TRELLIS CODING USING MULTI DIMENSIONAL QAM SIGNAL SETS 

by 

Steven S. Pietrobon and Daniel J. Costello, Jr. 

June 1989 

Submitted to the 

1990 IEEE International Symposium on Information Theory 


Summary 

A systematic method of finding good trellis codes using multi-dimensional QAM signal 
sets is presented. An important part of these types of trellis codes is the construction of 
the multi-dimensional signal sets. 

The method used is very similar to that in [1], in which multi-dimensional MPSK signal 
sets were constructed. That is, we start with a 2-D signal set with $M = 2^I$ points and form 
a partition chain such that the minimum squared subset distance (MSSD or $\delta_i^2$) at partition 
level $i$ is as large as possible. The partition starts at partition level 0 with the whole signal 
set, dividing each set in two until we are left with $M$ subsets of one point each at partition 
level $I$. With rectangular signal sets, it is easily shown that $\delta_{i+1}^2 = 2\delta_i^2$ for $1 \le i \le I - 2$ and 
$\delta_I^2 = \infty$. 

The next step in forming multi-dimensional signal sets is to take the Cartesian product of 
$L$ of these 2-D signal sets to form a $2L$-dimensional ($2L$-D) signal set and find a partitioning. 
This is achieved by the use of coset generators which are found from the partitioning of 
binary block codes. If the 2-D signal set is naturally mapped, a multi-D signal set mapping 
can be found which has at most $I$ bits affected by a phase rotation out of the total of $IL$ 
bits used to map the multi-D signal set. 

A 16QAM signal set is presented which has these properties. It is shown that only the 
two lsb's are affected by a 90° phase rotation, while the two msb's are unaffected by a phase 
rotation. This signal set is then used to construct 4-D, 6-D, and 8-D QAM signal sets which 
have only 2 bits affected by a phase rotation out of the $4L$ bits used to map the signal set. 

Since the multi-D signal sets have only 2 bits affected by a 90° phase rotation (due to the 
way they are constructed), many of the trellis codes that are found are rotationally invariant 
to 90° phase rotations. 

[1] S. S. Pietrobon, R. H. Deng, A. Lafanechere, G. Ungerboeck, and D. J. Costello, Jr., 
"Trellis coded multi-dimensional phase modulation", IEEE Trans. Inform. Theory, to appear. 



Appendix E 

Erasurefree Sequential Decoding and 
Its Application to Trellis Codes 


Erasurefree Sequential Decoding 
and Its Application to Trellis Codes* 


Fu-Quan Wang 
Daniel J. Costello, Jr. 

Dept. of Elec. and Comput. Engr. 
University of Notre Dame 
Notre Dame, Indiana 46556 

presented at 

1990 International Symposium 
on Information Theory 
San Diego, California 

January, 1990 


*This work was supported by NSF Grant NCR 89-03429 and NASA grant NAG 5-557. 


OUTLINE OF PAPER 
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• Performance Results 
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Why Sequential Decoding ? 


• The Viterbi Algorithm (VA) is practical for decoding 
convolutional codes with small constraint lengths $\nu$. 

• Performance (free distance $d_{free}$) is limited due to 
small $\nu$. 

• Sequential Decoding (SD) can be used with any value 
of $\nu$. 

• Better performance ($d_{free}$) can be achieved with 
larger $\nu$. 



Problems with Sequential Decoding 


• SD's computational effort is a random variable. 

• Therefore, some information may be lost due to overflow 
of the decoder input buffer. 

• This results in an erasure probability for SD typically 
on the order of $10^{-2}$ to $10^{-3}$ (Layland and Lushbaugh). 

• Complete (erasurefree) decoding may be impossible 
if a feedback channel is not available. 


Goal of This Research 


• Propose erasurefree SD algorithms which perform 
better than the VA and have lower computational re- 
quirements. 


• Investigate the application of SD to Trellis Codes. 

• Some results using conventional SD algorithms with 
Trellis Codes have been reported by Pottie and Taylor. 



Conventional Sequential Decoding Algorithms 


• The Fano Algorithm (FA) requires little storage. 

• The Stack Algorithm (SA) decodes faster at higher 
code rates. 

• The M-Algorithm (MA) achieves the performance 
of the VA for asymptotically large SNR. 

• The FA requires the least complexity cost to achieve 
the same performance (for a BER around 10~°). (Anderson 
and Mohan) 

• The FA is preferred in most practical implementations. 



Erasurefree Sequential Decoding 

• Assume that the information sequence is divided into
frames of length L, each terminated by a string of v zeros.

• Erasurefree algorithms require that a computational
limit C_lim be specified for each frame such that:

(1) If the number of computations C < C_lim, a conventional
sequential decoding algorithm is used.

(2) If C > C_lim, a suboptimal decoding algorithm
which guarantees complete decoding of the frame is used.

Examples 

(1) The Multiple Stack Algorithm (MSA; Chevillat and
Costello):

• Uses one large stack and several smaller stacks.

• Once the main stack is filled, the T best paths are
transferred to a secondary stack.

• Once a secondary stack is formed, the decoder can 
never back up beyond the initial nodes in that stack. 

• Additional secondary stacks are formed as needed. 

(2) The Erasurefree Fano Algorithm (EFA; new):

• A predetermined computational limit is set. 

• Once this limit is reached, the decoder jumps to the 
deepest node it has examined thus far (the deepest node 
must always be stored). 

• Decoding resumes at this node and can never back 
up beyond this node. 

• This process is repeated as many times as needed, 
but each with a smaller computational limit. 
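
The shrinking limits of the EFA can be sketched as follows. The slides do not say how the limit is reduced after each jump, so the geometric rule, the function names, and the numbers below are assumptions made only for illustration.

```python
def efa_limits(c_first, shrink, stages):
    """List per-stage computational limits that shrink after every forced jump.

    c_first : limit in force before the first jump (hypothetical value)
    shrink  : factor in (0, 1) applied after each jump (assumed rule)
    stages  : number of stages to list
    """
    if not 0 < shrink < 1:
        raise ValueError("shrink must lie strictly between 0 and 1")
    limits = []
    limit = float(c_first)
    for _ in range(stages):
        # Never let a stage drop below one computation.
        limits.append(max(1, int(limit)))
        limit *= shrink
    return limits


def worst_case_computations(limits):
    """Upper bound on the total work if every stage uses up its whole limit."""
    return sum(limits)


limits = efa_limits(1000, 0.5, 4)
print(limits, worst_case_computations(limits))
```

Because every stage is capped, the total work per frame is bounded, although the number of computations actually used remains random.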


Performance Comparison of the MSA, EFA, 
and VA 



[Figure: performance curves for the MSA, EFA, and VA versus SNR (dB)]

Problems with the MSA and EFA 

• Although it is bounded, the number of computations 
is still a random variable. 

• The maximum number of computations per frame,
C_max, must be large if good performance is desired.

• In order to guarantee erasurefree decoding with a
finite buffer, a large speed factor μ = C_max/(L + v) is
required (say, μ > 60 for MSA or μ > 150 for EFA).
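
A short numerical illustration of the speed factor relation, speed factor = C_max/(L + v). The frame length L = 500 and tail length v = 12 are hypothetical values chosen only to make the arithmetic concrete; they are not taken from the slides.

```python
def max_computations(mu, frame_len, v):
    """C_max implied by speed factor mu for a frame of frame_len plus a tail of v."""
    return mu * (frame_len + v)


def speed_factor(c_max, frame_len, v):
    """Speed factor implied by a per-frame computation budget c_max."""
    return c_max / (frame_len + v)


# Hypothetical frame: L = 500, v = 12, so L + v = 512.
print(max_computations(60, 500, 12))    # budget at the MSA figure of 60
print(max_computations(150, 500, 12))   # budget at the EFA figure of 150
```

Under these assumed values, the quoted speed factors correspond to budgets of tens of thousands of computations per frame.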

The Buffer Looking Algorithm (BLA)


[Figure: diagram of the BLA, showing the decoder input buffer B]



• The input buffer of the decoder is divided into K
sections.

• C_lim(j) is a computational limit corresponding to
the j-th section of the buffer.

• The decoder continuously monitors the buffer state
j (the number of occupied sections in the buffer).

• If C < C_lim(j), the BLA works exactly like the FA.

• If C > C_lim(j), the BLA works exactly like the EFA.

• If all buffer sections are occupied, the decoder changes 
parameters (bias) to guarantee the frame is decoded be- 
fore the buffer overflows. 
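
The mode selection described above can be sketched as a small function. The slides do not spell out how the "all sections occupied" case relates to the buffer state j, nor what happens when C equals the limit, so the sketch takes the first as a separate flag and treats equality like the EFA case. The names `bla_mode`, `limits`, and the mode labels are hypothetical.

```python
def bla_mode(computations, buffer_state, limits, all_sections_occupied=False):
    """Choose the decoding mode for the frame currently being decoded.

    computations          : computations C spent on this frame so far
    buffer_state          : j, the number of occupied buffer sections (1-based)
    limits                : limits[j - 1] holds the limit for buffer state j
    all_sections_occupied : True when the buffer is about to overflow
    """
    if all_sections_occupied:
        # Change parameters (bias) so the frame finishes before overflow.
        return "forced"
    if not 1 <= buffer_state <= len(limits):
        raise ValueError("buffer_state must be between 1 and len(limits)")
    c_lim = limits[buffer_state - 1]
    # Below the limit: behave like the Fano Algorithm.
    # At or above it: behave like the Erasurefree Fano Algorithm
    # (the equality case is an assumption).
    return "fano" if computations < c_lim else "efa"


limits = [4000, 1000]  # hypothetical limits for a two-section buffer
print(bla_mode(1500, 1, limits), bla_mode(1500, 2, limits))
```

With these assumed limits, the same amount of work is tolerated while the buffer is nearly empty but triggers the EFA behavior once the buffer has filled further.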







Erasurefree Decoding Conditions for the BLA 


• Let B be the size of the buffer.

• Let B_K be the size of the last section of the buffer.

• Let μ be the speed factor of the decoder.

• B > L + v.

• B_K > (L + v)/μ.

• C_lim(K) < (μ - 1)(L + v).
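
The three conditions above can be checked mechanically for a candidate design. This is a minimal sketch: the function name and the example parameter values are hypothetical, and the strict inequalities are taken as they appear on the slide.

```python
def erasurefree_conditions(buffer_size, last_section_size, c_lim_last,
                           mu, frame_len, v):
    """Evaluate the three BLA erasurefree conditions and report each one."""
    frame_total = frame_len + v  # L + v
    return {
        "buffer holds a frame": buffer_size > frame_total,
        "last section large enough": last_section_size > frame_total / mu,
        "last limit small enough": c_lim_last < (mu - 1) * frame_total,
    }


# Hypothetical design: B = 2048, B_K = 600, C_lim(K) = 400,
# speed factor 2, L = 1000, v = 12.
checks = erasurefree_conditions(2048, 600, 400, 2, 1000, 12)
print(all(checks.values()), checks)
```

Returning the individual checks rather than a single flag makes it easy to see which parameter has to change when a design fails.
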

Influence of Parameters on Performance


• Number of buffer sections:

Fewer sections allow larger computational limits in each
section (two sections work best).

• Buffer size:

A larger buffer allows more frames to be decoded optimally.

• Speed factor:

A larger speed factor implies more computations are available.

• Frame length:

With long frames, more data may be decoded suboptimally.




Sequential Decoding of Trellis Codes 

• Cut-off rate for two-dimensional signal constellations:

$$R_0 = 2\log_2 K - \log_2 \left\{ \sum_{i=0}^{K-1} \sum_{j=0}^{K-1} \exp\left[ -\frac{(a_{xi} - a_{xj})^2 + (a_{yi} - a_{yj})^2}{8\sigma^2} \right] \right\}$$

• For 8-PSK, R_0 = 2 bits/symbol when SNR = 7.6 dB.
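
The cut-off rate expression can be evaluated numerically. The sketch below assumes that SNR means E_s/N_0 with N_0 equal to twice the per-dimension noise variance, so that the denominator in the exponent equals 4 N_0. The function names are hypothetical.

```python
import math


def psk_points(k):
    """Unit-energy k-PSK constellation as (x, y) pairs."""
    return [(math.cos(2 * math.pi * i / k), math.sin(2 * math.pi * i / k))
            for i in range(k)]


def cutoff_rate(points, snr_db):
    """Cut-off rate R_0 in bits/symbol for equally likely 2-D signal points.

    snr_db is taken as E_s/N_0 in dB, with N_0 = 2 * sigma^2 (assumed).
    """
    k = len(points)
    es = sum(x * x + y * y for x, y in points) / k   # average symbol energy
    n0 = es / 10 ** (snr_db / 10)
    sigma2 = n0 / 2
    total = 0.0
    for xi, yi in points:
        for xj, yj in points:
            d2 = (xi - xj) ** 2 + (yi - yj) ** 2
            total += math.exp(-d2 / (8 * sigma2))
    return 2 * math.log2(k) - math.log2(total)


print(round(cutoff_rate(psk_points(8), 7.6), 2))
```

Under these assumptions the 8-PSK value at 7.6 dB comes out within about 0.01 of 2 bits/symbol, in agreement with the figure quoted above, and it approaches 3 bits/symbol at high SNR.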



[Figure: cut-off rate R_0 (bits/symbol) versus SNR (dB), with the SNR axis running from 0 to 20 dB]


Metric and Threshold Increment
for Trellis Codes 


• Branch Fano metric:

$$L(x_i, y_i) = \log_2 \frac{\exp\left(-\|y_i - x_i\|^2 / 2\sigma^2\right)}{\sum_{k=1}^{K} P(x_{ki}) \exp\left(-\|y_i - x_{ki}\|^2 / 2\sigma^2\right)} - 3R$$

• The unscaled threshold increment Δ should be chosen
between 3 and 5 for trellis codes (determined by experiment).
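
A branch metric of this form can be computed as sketched below. The sketch assumes two-dimensional signals in Gaussian noise with per-dimension variance `sigma2`, equally likely signals unless `priors` is given, and a rate bias passed in directly as `bias_bits`. All names are hypothetical, and the bias of 2 bits used in the example is an assumed value.

```python
import math


def branch_fano_metric(y, x, signal_set, sigma2, bias_bits, priors=None):
    """Log-likelihood ratio of received point y for hypothesis x, minus a bias.

    y, x       : (x, y) pairs for the received and hypothesized signal points
    signal_set : all signal points that could have been sent on this branch
    priors     : positive probabilities of those points (uniform if omitted)
    """
    k = len(signal_set)
    if priors is None:
        priors = [1.0 / k] * k

    def exponent(s):
        return -((y[0] - s[0]) ** 2 + (y[1] - s[1]) ** 2) / (2.0 * sigma2)

    # Log-sum-exp keeps the denominator stable at small noise variance.
    terms = [math.log(p) + exponent(s) for p, s in zip(priors, signal_set)]
    peak = max(terms)
    log_denom = peak + math.log(sum(math.exp(t - peak) for t in terms))
    return (exponent(x) - log_denom) / math.log(2.0) - bias_bits


psk8 = [(math.cos(math.pi * i / 4), math.sin(math.pi * i / 4)) for i in range(8)]
good = branch_fano_metric(psk8[0], psk8[0], psk8, 0.05, 2.0)
bad = branch_fano_metric(psk8[0], psk8[4], psk8, 0.05, 2.0)
print(good > 0 > bad)
```

The metric is positive along a branch that matches the received point and strongly negative along a mismatched one, which is the property that lets the decoder detect that it has left the correct path.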



Quantization Schemes 



• Using more than 5 bits for circular quantization or more
than 8 bits for rectangular quantization is virtually
equivalent to using 5-bit circular or 8-bit rectangular
quantization, respectively.


Tail Mapping Must Be Changed for
Frame-type Decoding of Trellis Codes 





• Ungerboeck 8-state (v = 3) code.

• Tail begins at point X and we assume no noise occurs 
after X. 


• Path 1 is the correct path. 

• Branch y is corrupted by noise, which makes the 
decoder follow path 2 (an incorrect path). 

• The noise level:

$\sqrt{\Delta_0^2 + \Delta_1^2}/2\ (= 0.8) < |n| < d_{free}/2\ (= 1.1)$.

• Natural mapping cannot correct the error in a one 
constraint length tail. 

• This kind of error will dominate in many cases. 

• Only 0 and 1 are possible signals in the tail (00X). 

• Change the mapping in the tail to achieve a larger 
distance between signals 0 and 1. 
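
The two bounds on the noise level can be reproduced with a few lines of arithmetic. The sketch assumes unit-energy 8-PSK and takes the squared free distance of the 8-state code as 4.586, the value usually tabulated for this code; neither assumption is stated on the slide.

```python
import math


def tail_noise_window():
    """Return (lower, upper) bounds on |n| for the tail error event."""
    delta0 = 2.0 * math.sin(math.pi / 8.0)  # adjacent 8-PSK points
    delta1 = math.sqrt(2.0)                 # points 90 degrees apart
    d_free = math.sqrt(4.586)               # assumed free distance of the code
    lower = math.sqrt(delta0 ** 2 + delta1 ** 2) / 2.0
    upper = d_free / 2.0
    return lower, upper


lower, upper = tail_noise_window()
print(f"{lower:.2f} < |n| < {upper:.2f}")  # prints "0.80 < |n| < 1.07"
```

Under natural mapping, signals 0 and 1 are adjacent points separated by only about 0.77, whereas two 8-PSK points can be as far as 2 apart, which shows how much room a changed tail mapping has to work with.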




Conclusions 


• Erasurefree sequential decoding algorithms can per- 
form better than the VA with less computational effort. 


• SD can work for trellis codes as well as convolutional 
codes. 

• More than 1 dB gain over the VA can be achieved at 
a BER of 10“° when the BLA is applied to trellis codes 
(Porath and Aulin code). 